Algebraic soft decoding of Reed-Solomon codes

ABSTRACT

An algorithmic soft-decision decoding method for Reed-Solomon codes proceeds as follows. Given the reliability matrix Π showing the probability that a code symbol of a particular value was transmitted at each position, a multiplicity matrix M is computed which determines the interpolation points and their multiplicities. Given this multiplicity matrix M, soft interpolation is performed to find the non-trivial polynomial Q_(M)(X,Y) of the lowest (weighted) degree whose zeros and their multiplicities are as specified by the matrix M. Given this non-trivial polynomial Q_(M)(X,Y), all factors of Q_(M)(X,Y) of type Y−ƒ(X) are found, where ƒ(X) is a polynomial in X whose degree is less than the dimension k of the Reed-Solomon code. Given these polynomials ƒ(X), a codeword is reconstructed from each of them, and the most likely of these codewords is selected as the output of the algorithm. The algorithmic method is algebraic, operates in polynomial time, and significantly outperforms conventional hard-decision decoding, generalized minimum distance decoding, and Guruswami-Sudan decoding of Reed-Solomon codes. By varying the total number of interpolation points recorded in the multiplicity matrix M, the complexity of decoding can be adjusted in real time to any feasible level of performance. The algorithmic method extends to algebraic soft-decision decoding of Bose-Chaudhuri-Hocquenghem codes and algebraic-geometry codes.

REFERENCE TO A RELATED PATENT APPLICATION

The present utility patent application is descended from, and claims benefit of priority of, U.S. provisional patent application Serial No. 60/164,095 filed on Nov. 8, 1999, having the same name and the same inventors as the present application.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to decoding of error-correcting codes, and in particular, to efficient soft-decision decoding of Reed-Solomon and related codes.

The present invention particularly concerns an algorithmic soft-decision decoding procedure for Reed-Solomon codes, the procedure (i) making optimal use of the “soft-decision” reliability information available from a communication or storage system (channel), while (ii) executing in a time that scales polynomially with the length of the code. Said decoding procedure also extends to the decoding of Bose-Chaudhuri-Hocquenghem (BCH) codes and most algebraic-geometry (AG) codes, while significantly outperforming the generalized minimum distance (GMD) decoding of Forney, Jr.

2. Description of Prior and Related Art

2.1. Reed-Solomon Codes

Reed-Solomon (RS) codes are among the most extensively used error-correcting codes today, and their practical importance is well established. See the book S. B. Wicker, V. K. Bhargava, Reed-Solomon Codes and Their Applications, IEEE Press, 1994, and in particular the chapters by K. A. S. Immink, Reed-Solomon codes and the compact disc, ibid., 1994; R. J. McEliece, L. Swanson, Reed-Solomon codes and the exploration of the solar system, ibid., 1994; S. B. Wicker, M. Bartz, Reed-Solomon codes in hybrid automatic repeat-request systems, ibid., 1994; M. B. Pursley, Reed-Solomon codes in frequency-hop communications, ibid., 1994; D. Sarwate, Reed-Solomon codes and the design of sequences for spread-spectrum multiple-access communications, ibid., 1994; and E. R. Berlekamp, G. Seroussi, P. Tong, A hypersystolic Reed-Solomon decoder, ibid., 1994. See also E. R. Berlekamp, R. E. Peile, S. P. Pope, The application of error control to communications, IEEE Commun. Mag., vol. 25, pp. 44-57, 1987; W. W. Wu, D. Haccoun, R. E. Peile, Y. Hirata, Coding for satellite communication, IEEE J. Select. Areas Commun., vol. SAC-5, pp. 724-785, 1987; Consultative Committee for Space Data Systems, Recommendations for Space Data System Standards, Blue Book, 1984; M. B. Pursley, W. E. Stark, Performance of Reed-Solomon coded frequency-hop spread-spectrum communication in partial band interference, IEEE Trans. Commun., vol. COM-33, pp. 767-774, 1985; and D. Divsalar, R. M. Gagliardi, J. H. Yuen, PPM performance for Reed-Solomon decoding over an optical-RF relay link, IEEE Trans. Commun., vol. COM-32, pp. 302-305, 1984.

Reed-Solomon codes are used in digital audio disks (CD's) and digital versatile disks (DVD's). See S. B. Wicker, V. K. Bhargava, op. cit. and K. A. S. Immink, op. cit.

Reed-Solomon codes are used in satellite communications. See S. B. Wicker, V. K. Bhargava, op. cit., E. R. Berlekamp, R. E. Peile, S. P. Pope, op. cit., and W. W. Wu et al., op. cit.

Reed-Solomon codes are used in deep-space telecommunication systems, including multiple deep-space communication standards. See S. B. Wicker, V. K. Bhargava, op. cit., R. J. McEliece, L. Swanson, op. cit., and Consultative Committee for Space Data Systems, op. cit.

Reed-Solomon codes are used in frequency-hop and spread-spectrum systems. See M. B. Pursley, op. cit., D. Sarwate, op. cit., and M. B. Pursley, W. E. Stark, op. cit.

Reed-Solomon codes are used in error-control systems with feedback. See S. B. Wicker, V. K. Bhargava, op. cit. and S. B. Wicker, M. Bartz, op. cit.

Reed-Solomon codes are used in optical communications. See S. B. Wicker, V. K. Bhargava, op. cit. and D. Divsalar et al., op. cit.

2.2 Decoding of Reed-Solomon Codes

The decoding of Reed-Solomon codes is one of the most frequently performed tasks in today's communication and storage systems. Since the discovery of Reed-Solomon codes in the early 1960's (see I. S. Reed, G. Solomon, Polynomial codes over certain finite fields, J. Soc. Indust. Appl. Math., vol. 8, pp. 300-304, 1960), a steady stream of work has been directed towards streamlining their decoding algorithms. Today (circa 2000), very efficient techniques are available to accomplish that task. In particular, hard-decision decoders for RS codes have been implemented in hyper-systolic VLSI using algebraic decoding algorithms. Such decoders have been designed to operate at sustained data rates of 820 Mb/s. See E. R. Berlekamp, G. Seroussi, P. Tong, A hypersystolic Reed-Solomon decoder, op. cit.

An important problem in hard-decision decoding of Reed-Solomon codes is that of decoding beyond the error-correction radius (which is equal to one half of the minimum distance of the code). Search techniques have traditionally dominated approaches to this problem. See e.g., R. E. Blahut, Theory and Practice of Error Control Codes, Addison Wesley, 1994. However, for polynomially-bounded decoding complexity, these techniques do not achieve decoding beyond half the minimum distance of the code, in an asymptotic sense.

A breakthrough in this area was achieved by Sudan in 1997. Reference M. Sudan, Decoding of Reed-Solomon codes beyond the error correction bound, J. Complexity, vol. 13, pp. 180-193, 1997; and also V. Guruswami, M. Sudan, Improved decoding of Reed-Solomon and algebraic-geometric codes, IEEE Trans. Inform. Theory, vol. 45, pp. 1755-1764, 1999. In the form presented in V. Guruswami, M. Sudan, op. cit., Sudan's algorithm corrects any fraction of up to τ≦1−√R erroneous positions in an RS code of rate R. Thus the error correction capabilities exceed the minimum distance error-correction bound (1−R)/2 for all rates in the interval [0,1]. The present invention makes use of the methods and algorithms developed by V. Guruswami, M. Sudan, op. cit.

2.3 Soft-Decision Decoding

Early in the development of coding theory, it was found convenient to represent communication channels as conveyors of symbols drawn from finite sets. The effects of channel noise were represented by the occasional (random) reception of a symbol other than the one that was transmitted. This abstraction of reality permitted the application of powerful algebraic and combinatoric tools to the code design and decoding problems; Reed-Solomon codes themselves were developed through this abstraction.

In reality, however, channel noise is almost always a continuous phenomenon. What is transmitted may be selected from a discrete set, but what is received comes from a continuum of values. This viewpoint leads to the soft-decision decoder, which accepts a vector of real samples of the noisy channel output and estimates the vector of channel input symbols that was transmitted. Alternatively, a soft-decision decoder can accept any quantization of the real samples of the noisy channel output. By contrast, the hard-decision decoder requires that its input be from the same alphabet as the channel input. It is now well known that soft-decision decoding techniques can provide up to 3 dB more coding gain for the additive white Gaussian channel.

A soft-decision decoder accepts analog values directly from the channel; the demodulator is not forced to decide which of the q possible symbols a given signal is supposed to represent. The decoder is thus able to make decisions based on the quality of a received signal. All of the information on the “noisiness” of a particular received signal is lost when the demodulator assigns a symbol to the signal prior to decoding. It has been estimated that this loss of information results in a 2 to 3 dB loss in performance.

2.4. Soft-decision Decoding of Reed-Solomon Codes

Soft-decision decoding of Reed-Solomon codes is different in many ways from hard-decision decoding. The advantage of soft-decision over hard-decision decoding is adequately established in many works. See, for example, J. G. Proakis, Digital Communications, New York: McGraw-Hill, 1983; G. C. Clark, J. B. Cain, Error Correction Coding for Digital Communication, New York: Plenum, 1981; and U. Cheng, G. K. Huth, Bounds on the bit error probability of a linear cyclic code over GF(2^(i)) and its extended code, IEEE Trans. Inform. Theory, vol. 34, pp. 776-785, 1988. This latter reference also contains graphs of soft-decision coding gain versus signal-to-noise ratio (SNR) for various RS codes.

The mainstream approach to the problem of soft-decision decoding was pioneered by Forney, and is known as generalized minimum-distance (GMD) decoding. See G. D. Forney, Jr., Generalized minimum distance decoding, IEEE Trans. Inform. Theory, vol. IT-12, pp. 125-131, 1966, and G. D. Forney, Jr., Concatenated Codes, MIT Press, 1966. The complexity of GMD decoding is moderate and ultimately is of the same order as the complexity of hard-decision decoding. See U. Sorger, A new Reed-Solomon decoding algorithm based on Newton's interpolation, IEEE Trans. Inform. Theory, vol. IT-39, pp. 358-365, 1993; and also R. Kötter, Fast generalized minimum distance decoding of algebraic geometric and Reed-Solomon Codes, IEEE Trans. Inform. Theory, vol. 42, pp. 721-738, 1996. However, the gains that can be realized by GMD decoding are also moderate.

Even though the decoder can often be supplied with reliable soft-decision data relatively easily, the high complexity of optimal soft decoding makes utilization of such data prohibitive. In fact, all the available optimal soft decoding algorithms—e.g. the algorithm of Vardy and Be'ery (see A. Vardy, Y. Be'ery, Bit-level soft decision decoding of Reed-Solomon codes, IEEE Trans. Communications, vol. 39, pp. 440-445, March 1991) or its extensions (see e.g. V. Ponnampalam, B. S. Vucetic, Soft decision decoding of RS codes, in Proc. 13th AAECC Symposium on Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Hawaii, USA, November 1999; S. Ray-Chaudhuri, A. H. Chan, Bit-level parallel decoding of Reed-Solomon codes, in Proc. 31st Allerton Conference on Communications, Control and Computing, Monticello, Ill., September, 1993)—run in time that scales exponentially with the length of the code. This makes the use of such algorithms generally infeasible in practice.

In this context, the reference of E. R. Berlekamp et al., op. cit., states that “the major drawback with RS codes for satellite use is that the present generation of decoders do not make full use of bit-based soft-decision information.” In the same context, the reference of S. B. Wicker, V. K. Bhargava, op. cit., states that “the ‘Holy Grail’ of Reed-Solomon decoding research is the maximum-likelihood soft-decision decoder.” Thus, due to the ubiquity of RS codes in today's communication and recording systems, efficient soft decoding of RS codes, while making full use of the soft-decision reliability data, remains one of the most important tasks of coding theory and practice.

The present invention will be seen to concern a soft-decision Reed-Solomon decoding method and decoder that (i) makes full use of soft decision information and (ii) runs in time that scales polynomially with the length of the code. The soft-decision decoding method and decoder of the present invention will be also seen to be effective in operating on hard-decision input data.

BRIEF SUMMARY OF THE INVENTION

This invention provides a method for soft-decision decoding of error correction codes such as Reed-Solomon, BCH, and, in general, algebraic-geometric codes. Analog samples that have been received from a communication channel are processed to estimate the likelihoods of the symbols that were input to the communication channel. In accordance with one embodiment of the invention, reliability information concerning the relative likelihoods of different symbols is converted into a set of algebraic interpolation conditions. The interpolation conditions are used to produce a polynomial which, when factored, will yield factors that correspond to codewords. In accordance with a further embodiment of the invention, a vector of symbols received from the communication channel is used to compute a set of algebraic interpolation conditions. The computed set of algebraic interpolation conditions is interpolated to find an interpolation polynomial that satisfies the algebraic interpolation conditions. The interpolation polynomial is then factored to find factors that correspond to codewords of the error correcting code. In this manner, a list of candidate codewords is generated.

SUMMARY OF THE INVENTION

It is a primary object of the present invention to provide an efficient algorithmic method for soft-decision decoding of Reed-Solomon codes and related error-correcting codes.

More particularly, it is a primary object of the present invention to provide an algorithm that always runs in time that scales polynomially with the length of the code, while significantly outperforming (i) conventional hard-decision decoding, (ii) decoding beyond half the minimum distance, as taught by Guruswami and Sudan op. cit., and (iii) GMD soft-decision decoding. Specifically, the algorithmic method of the present invention provides substantially more coding gain than any of the above methods.

Reference is now made to FIG. 2, which should be considered exemplary only and not delimiting of the scope of the present invention. FIG. 2 is a graph showing a comparison of the performance of the method of the present invention for a simple concatenated Reed-Solomon coding scheme. An RS(256,144,113) code of length 256 and dimension 144 over a finite field with 256 elements is concatenated with a binary parity check code of length 9, so that the overall coding scheme represents a binary code of rate 1/2. The channel is an additive, white, Gaussian noise (AWGN) channel. The inner parity check code is in all cases decoded with a maximum a posteriori likelihood decoding algorithm. The four different curves of FIG. 2 correspond to the performance achieved by two hard-decision decoding algorithms and two soft-decision decoding algorithms. The two hard-decision decoding algorithms are (i) conventional Berlekamp-Welch 40 decoding up to half the minimum distance, and (ii) the algorithm by Guruswami and Sudan 41 that decodes beyond half the minimum distance. The two soft-decision decoding algorithms are (iii) Forney's GMD decoding 42 and (iv) the algebraic soft-decision decoding algorithm 43 of the present invention taught within this specification. It can be seen from FIG. 2 that the algorithmic method of the present invention significantly outperforms the other methods.

It is a related object of the present invention that the efficient, algorithmic method for the soft-decision decoding of Reed-Solomon codes should have additional useful features. Among the most useful features of the algorithmic method of the present invention is that there exists a trade-off between complexity and performance and, moreover, this trade-off can be chosen freely in real-time by adjusting a single parameter in the decoding procedure. Hence, the coding gain provided by the Reed-Solomon code can be traded for computational complexity in real-time. This is true in any application. For any desired level of performance, the complexity is bounded by a polynomial in the length of the code. In particular, the computational complexity can be adjusted to any required performance within a fundamental bound. (This bound is described in detail in Appendix A to this application.)

It is still yet another object of the present invention that the efficient, algorithmic method of soft-decision decoding of Reed-Solomon codes should readily extend to the decoding of Bose-Chaudhuri-Hocquenghem (BCH) codes and most algebraic-geometry (AG) codes.

Briefly, to achieve the objects of the present invention, an algorithmic decoding method consisting of three major steps is disclosed. For later reference, let 𝔽_(q)={α₁,α₂, . . . , α_(q)} denote a finite field with q elements, where q is a positive integer power of a prime integer. Given that a codeword c=(c₁,c₂, . . . , c_(n)) of a Reed-Solomon code over the alphabet 𝔽_(q)={α₁,α₂, . . . , α_(q)} is transmitted over a channel, and a vector y=(y₁,y₂, . . . , y_(n)) is observed at the channel output, the input to the algorithm consists of the q×n soft-decision reliability matrix Π whose entries are

π_(i,j) = Pr(c_(j)=α_(i) | y_(j))

where Pr(c_(j)=α_(i)|y_(j)) is the probability that the j-th transmitted symbol was α_(i) given that the j-th observed symbol is y_(j). The three major steps of the algorithmic method are as follows:

1. Determination of Interpolation Points: Given the reliability matrix Π, compute the multiplicity matrix M which determines the interpolation points and their multiplicities.

2. Soft Interpolation: Given the multiplicity matrix M from the previous step, find the non-trivial polynomial Q_(M)(X,Y) of the lowest (weighted) degree whose zeros and their multiplicities are as specified by the matrix M.

3. Factorization Step: Given the non-trivial polynomial Q_(M)(X,Y) from the previous step, find all factors of Q_(M)(X,Y) of type Y−ƒ(X), where ƒ(X) is a polynomial in X whose degree is less than the dimension k of the Reed-Solomon code. Given the polynomials ƒ(X), reconstruct from each of them a codeword of the RS code.

These three steps can be performed either sequentially or in a pipeline. The overall output of the algorithmic method consists of the codewords resulting from the factorization step. Each of the three steps may be realized and obtained as described below and in the detailed specification.

1. An Algebraic Soft-decision Decoding Method

It should first be understood as background to understanding the present invention that the word “soft” occurring in the phrase “soft-decision decoding of Reed-Solomon codes” means something other, and different, than the more common, and older, hard-decision decoding of Reed-Solomon codes. As explained in the Background section of this specification, a soft-decision decoder accepts a vector of real samples of a noisy channel output, which samples correspond to the reliabilities of possible input symbols, and estimates the vector of channel input symbols that was transmitted.

There is but little prior art of direct relevance to the present invention being that soft decoding is considerably different from, and advanced over, standard (“hard”) Reed-Solomon decoding. All previous art regarding standard hard-decision RS decoding is somewhat distant from the present invention. On the other hand, the prior art of direct relevance to the present invention is relatively new: The method of the present invention builds on, and significantly extends upon, an algebraic interpolation algorithm proposed recently by V. Guruswami, M. Sudan op. cit. (for hard-decision decoding).

Accordingly, in one of its aspects the present invention will be recognized to be embodied in a method of decoding an error-correction code wherein the method is algebraic, i.e., algebraic soft-decision decoding. In the context of this invention, algebraic soft-decision decoding may be understood as a decoding method that proceeds by converting reliability information concerning the relative likelihoods of different symbols into a set of algebraic interpolation conditions. A set of algebraic interpolation conditions may be understood, for example, as a system of equations over a finite field. Although soft-decision decoding of various codes has previously been performed (see E. R. Berlekamp, Soft Decision Reed-Solomon Decoder, U.S. Pat. No. 4,821,268, issued Apr. 11, 1989; E. R. Berlekamp, Bounded distance+1 soft-decision Reed-Solomon decoding, IEEE Trans. Inform. Theory, vol. 42, pp. 704-721, 1996; N. Kamiya, A sufficient condition for a generalized minimum distance, Reed-Solomon decoder to ensure correct decoding, IEICE Transactions Fundamentals, vol. E80-A, pp. 2073-2088, 1997; M. S. Oh, P. Sweeney, Bit-level soft-decision sequential decoding for Reed Solomon codes, Workshop on Coding and Cryptography, Paris, France, January 1999; V. Ponnampalam, B. S. Vucetic, Soft decision decoding of Reed-Solomon codes, in Proc. 13th Symp. Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes, Honolulu, Hi., USA, November 1999; S. Ray-Chaudhuri, A. H. Chan, Bit-level parallel decoding of Reed-Solomon codes, in Proc. 31st Allerton Conference on Communications, Control and Computing, Monticello, Ill., September 1993; S. K. Shin, P. Sweeney, Soft decision decoding of Reed-Solomon codes using trellis methods, IEE Proc.-Commun., vol. 141, pp. 303-308, 1994; S. K. Shin, P. Sweeney, Evaluation of efficient trellis methods for soft decision decoding of Reed-Solomon codes, IEE Proc.-Commun., 1996, vol. 143, pp. 61-67, 1996; D. J. Taipale, M. J. Seo, An efficient soft-decision Reed-Solomon decoding algorithm, IEEE Trans. Inform. Theory, vol. 40, pp. 1130-1132, 1994; A. Vardy, Y. Be'ery, Bit-level soft-decision decoding of Reed-Solomon codes, IEEE Trans. Commun., vol. 39, pp. 440-445, 1991), it has never been algebraically so performed, as was previously discussed. The error-correction code to be decoded may in particular be a code of the Reed-Solomon, BCH, or algebraic-geometry types.

Alternatively, the present invention may be considered to be embodied in a method of decoding an error-correction code that is characterized by transpiring in polynomial time. Again, and although optimal soft-decision decoding has previously been performed, it has never heretofore been so performed in polynomial time, again as was previously discussed. The code may again, in particular, be a code of the Reed-Solomon, BCH, or algebraic-geometry types.

In greater detail, in the preferred embodiment of the present invention a q×n reliability matrix Π with elements π_(i,j) is obtained from the output of a communications channel. The alphabet is of size q, and n is the length of the Reed-Solomon code.

The preferred algebraic soft-decision decoding method, in accordance with the present invention, then consists of (i) computing a multiplicity matrix M, (ii) interpolating so as to find a non-trivial polynomial which passes through the points specified by M, (iii) factorizing the non-trivial polynomial into factors that correspond to codewords, and (iv) selecting a single codeword from among these factors.

Namely, (i) a multiplicity matrix M is found that has non-negative integer entries m_(i,j) corresponding to the received reliability matrix Π, which entries m_(i,j) specify interpolation points.

Then, given this multiplicity matrix M, (ii) interpolating is performed so as to find a non-trivial polynomial Q_(M)(X,Y) of lowest (1,k−1)-weighted degree such that Q_(M)(X,Y) has a zero of multiplicity at least m_(i,j) in the point (x_(j),α_(i)).

Then, given this non-trivial polynomial Q_(M)(X,Y), (iii) the polynomial is factored so as to find factors of type Y−ƒ(X), deg(ƒ)<k, which factors correspond to codewords.

Then, given these codewords, (iv) a single codeword, most suitable to reconstitute from the received reliability matrix Π the data that was originally transmitted upon the communications channel, is selected.
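By way of illustration only, the data flow through these four steps may be sketched in a few lines of code. The following Python sketch is not part of the claimed method; the four callable arguments are hypothetical placeholders for the computing, interpolating, factorizing, and selecting steps just described.

```python
# High-level sketch only: the four callables are hypothetical placeholders
# for steps (i)-(iv) described above; only the overall data flow is shown.
def algebraic_soft_decision_decode(Pi, s, compute_multiplicities, interpolate,
                                   factorize, select_codeword):
    M = compute_multiplicities(Pi, s)        # (i)   reliability matrix -> multiplicity matrix M
    Q = interpolate(M)                       # (ii)  soft interpolation -> Q_M(X, Y)
    candidates = factorize(Q)                # (iii) factors Y - f(X)   -> candidate codewords
    return select_codeword(candidates, Pi)   # (iv)  most likely candidate codeword
```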

In still greater detail, the (i) finding the multiplicity matrix M preferably transpires by a process of:

1) commencing with an initial reliability matrix Π*=Π as received, and with an empty all-zero matrix M into which the interpolation points will be recorded, and with an integer s representing the total number of interpolation points,

2) finding a position (i,j) of the largest element in Π*,

3) deriving π*_(i,j)←π_(i,j)/(m_(i,j)+2); m_(i,j)←m_(i,j)+1; and s←s−1

4) deciding if s=0, and if so then outputting M, else returning to performing the 2) finding and the 3) deriving and the 4) deciding.
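The iterative procedure just listed can be summarized in a short sketch. This is an illustrative Python rendering, not part of the claimed method; the function name and the use of the numpy library are assumptions, and ties in the 2) finding step are broken arbitrarily.

```python
import numpy as np

# Illustrative sketch of the procedure above (function name and numpy use are
# assumptions; ties in the argmax are broken arbitrarily).
def choose_interpolation_points(Pi, s):
    """Pi is the q x n reliability matrix; s is the total number of
    interpolation points to record.  Returns the multiplicity matrix M."""
    Pi = np.asarray(Pi, dtype=float)
    Pi_star = Pi.copy()                       # working copy, the matrix Pi*
    M = np.zeros(Pi.shape, dtype=int)         # empty all-zero matrix M
    while s > 0:
        i, j = np.unravel_index(np.argmax(Pi_star), Pi_star.shape)  # 2) largest entry of Pi*
        Pi_star[i, j] = Pi[i, j] / (M[i, j] + 2)                    # 3) pi*_ij <- pi_ij / (m_ij + 2)
        M[i, j] += 1                                                #    m_ij  <- m_ij + 1
        s -= 1                                                      #    s     <- s - 1
    return M                                                        # 4) output M when s = 0
```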

2. Operation to the Algorithmic Method of the Present Invention in Determining the Interpolation Points, by Example

The algorithmic method of the present invention will be taught, and proved, within this specification disclosure in the formal language of mathematics, which language of mathematics also serves to rigorously define the invention. However, and at the risk of oversimplification, it is possible to illustrate, by way of a simple example, one of the main components of the present invention, thereby according insight into how the invention works.

An example of the present invention that substantially illustrates the main idea behind the determination of interpolation points (which example is later again set forth in Section 2 of the Description of the Preferred Embodiment part of this specification disclosure, although at that point almost exclusively in mathematical language) is as follows.

Suppose there are five positions in an exemplary set, and each position can take five possible values, i.e., {0, 1, 2, 3, 4}. Suppose a codeword c=(1, 2, 3, 4, 0) is sent. For later reference, we make note now, in passing, that this particular sent codeword c has a very easy, and straightforward, relationship to its positions: namely, if a position i of the sent codeword c is represented as x_(i), and the value of that position i represented as y_(i), then y_(i)=x_(i)+1 modulo 5. Similarly, any sent codeword can be represented as a polynomial expression. Mathematically, this is expressed as:

ℂ₅(5,2) = {(ƒ(0), ƒ(1), ƒ(2), ƒ(3), ƒ(4)) : ƒ(X)=a+bX, a,b∈𝔽₅}

where ℂ₅(5,2) is the Reed-Solomon code of length 5 and dimension 2 over the field 𝔽₅. Accordingly, there are q^(k)=5²=25 codewords, corresponding to 25 different polynomials, as shown in the following table:

a   b   ƒ(X)      (c₀, c₁, c₂, c₃, c₄)
0   0   0         (0, 0, 0, 0, 0)
1   0   1         (1, 1, 1, 1, 1)
2   0   2         (2, 2, 2, 2, 2)
3   0   3         (3, 3, 3, 3, 3)
4   0   4         (4, 4, 4, 4, 4)
0   1   X         (0, 1, 2, 3, 4)
1   1   1 + X     (1, 2, 3, 4, 0)
2   1   2 + X     (2, 3, 4, 0, 1)
3   1   3 + X     (3, 4, 0, 1, 2)
4   1   4 + X     (4, 0, 1, 2, 3)
0   2   2X        (0, 2, 4, 1, 3)
1   2   1 + 2X    (1, 3, 0, 2, 4)
2   2   2 + 2X    (2, 4, 1, 3, 0)
3   2   3 + 2X    (3, 0, 2, 4, 1)
4   2   4 + 2X    (4, 1, 3, 0, 2)
0   3   3X        (0, 3, 1, 4, 2)
1   3   1 + 3X    (1, 4, 2, 0, 3)
2   3   2 + 3X    (2, 0, 3, 1, 4)
3   3   3 + 3X    (3, 1, 4, 2, 0)
4   3   4 + 3X    (4, 2, 0, 3, 1)
0   4   4X        (0, 4, 3, 2, 1)
1   4   1 + 4X    (1, 0, 4, 3, 2)
2   4   2 + 4X    (2, 1, 0, 4, 3)
3   4   3 + 4X    (3, 2, 1, 0, 4)
4   4   4 + 4X    (4, 3, 2, 1, 0)
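For illustration only, the table above may be regenerated by a few lines of code; the following sketch merely evaluates ƒ(X)=a+bX modulo 5 and is not part of the specification.

```python
# Sketch: regenerate the 25 codewords of C_5(5,2) listed above by evaluating
# f(X) = a + bX at X = 0, 1, 2, 3, 4 over F_5.
q = 5
for b in range(q):
    for a in range(q):
        print(a, b, tuple((a + b * x) % q for x in range(q)))
```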

Now, assume that the sent codeword is received as a vector y such that the 5×5 matrix Π with elements π_(i,j)=Pr(c_(j)=i|y_(j)) reads:

            j = 0     j = 1     j = 2     j = 3     j = 4
i = 0       0.01      0.0025    0.05      0.14      0.20
i = 1       0.06      0.0025    0.09      0.14      0.05
i = 2       0.02      0.9900    0.15      0.07      0.20
i = 3       0.01      0.0012    0.61      0.44      0.40
i = 4       0.90      0.0038    0.10      0.21      0.15

This matrix Π is called the reliability matrix; it shows, for each position of the code, the probability that a code symbol of a particular value was transmitted in that position. The reliability matrix can be computed from the received vector y using the method taught in Section 2 of the Description of the Preferred Embodiment part of this specification.
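Although the specific computation is taught in Section 2 of the Description of the Preferred Embodiment, a generic sketch of how such a matrix can be obtained, assuming only a memoryless channel with known conditional probabilities Pr(y|α) and equiprobable code symbols, is as follows (the function names are illustrative assumptions, not taken from the specification):

```python
import numpy as np

# Sketch only: Bayes' rule for a memoryless channel with equiprobable inputs.
# channel_likelihood(y_j, alpha) is assumed to return Pr(y_j | alpha was sent).
def reliability_matrix(y, alphabet, channel_likelihood):
    Pi = np.array([[channel_likelihood(yj, alpha) for yj in y] for alpha in alphabet])
    return Pi / Pi.sum(axis=0, keepdims=True)    # each column then sums to 1, as in the example
```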

Although some positions of the exemplary reliability matrix Π show a high probability that some one particular code value was indeed transmitted in a particular position (with correspondingly low probabilities that any other value was transmitted in that position), other positions of the same matrix show only low confidence in the value of the transmitted symbol. For example, the 0.9900 probability value at position (2,1) of the reliability matrix Π indicates a very high probability that the value 2, and no other, was transmitted at position 1. Conversely, the 0.40 probability value at position (3,4) of the reliability matrix Π indicates only slight confidence that the value 3, and no other, was transmitted at position 4.

In classic Reed-Solomon “hard” decoding, a decision is derived from each column of the reliability matrix Π. Taking the maximum value in each of the five columns, as explained above, the sent codeword c is estimated to be (4,2,3,3,3). This represents three errors (out of five positions) relative to the codeword (1,2,3,4,0) that was actually sent. The communications channel upon which the codeword c was sent is apparently very noisy.

However, the present invention will show how the codeword actually sent may be fully accurately recovered in this example. First, the reliability matrix Π contains a great deal of information. Given the information in Π, it is clear that we would like to weigh each position differently. This can be done by requiring that a polynomial Q_(M)(X,Y) should pass through different points with different multiplicities. This is akin to finding the equation of a curve that passes through given discrete points on a plane. If the equation of such a curve can be found, then it contains information regarding all the points.

As discussed above, the three major steps of the algorithm, in accordance with the present invention, to recover the codeword are: 1) Determination of Interpolation Points, 2) Soft Interpolation, and 3) Factorization Step. The first step (1) Determination of Interpolation Points is now illustrated by example.

The determination consists of computing entries in the multiplicity matrix M by an iterative process. Mathematically the process commences with (i) an initial reliability matrix Π*=Π as received from a communications channel, (ii) an empty all-zero matrix M into which the entries will be recorded, and (iii) an integer s representing the total number of interpolation points.

Next, a position (i,j) of the largest element in Π* is found. Next, each of

(i) π*_(i,j)←π_(i,j)/(m_(i,j)+2); (ii) m_(i,j)←m_(i,j)+1; and (iii) s←s−1,

are derived. Finally, it is decided if s=0, and if so then M is outputted, else return is made to performing the finding and the deriving and the deciding.

An easy way to understand what this means is by example. Let s, or the number of interpolation points to be recorded (that is, the number of iterations to be made), equal 9. This number s is arbitrary: it will later be discussed that the number of iterations in the algorithm of the present invention may be traded-off for accuracy in the recovery of RS codewords. It will later be found that, for the noise experienced on this channel in the example that follows, 9 iterations will suffice for accurate codeword recovery. More iterations could be performed, but are unnecessary.

The initial, empty, all-zero matrix M is: $M = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$

Displaying this matrix M, and the initial exemplary reliability matrix Π*=Π together we have:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.9900     0.15      0.07     0.20      0     0     0     0     0
0.01     0.0012     0.61      0.44     0.40      0     0     0     0     0
0.90     0.0038     0.10      0.21     0.15      0     0     0     0     0

Now, by the process of the present invention, (i) the single largest numerical probability in the matrix Π* is located, (ii) the corresponding position of the matrix M is augmented by +1, and (iii) the corresponding original probability in the matrix Π is divided by the augmented multiplicity plus one (in this case, by the number 2), and the result replaces the entry in the matrix Π*. For matrix M, and matrix Π*, we then get the following (changed positions are marked with an asterisk for convenience of reference):

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.4950*    0.15      0.07     0.20      0     1*    0     0     0
0.01     0.0012     0.61      0.44     0.40      0     0     0     0     0
0.90     0.0038     0.10      0.21     0.15      0     0     0     0     0

The process continues: locate the now-highest probability in the matrix Π*, and adjust it and the corresponding position of the matrix M. Thus the next, second, iteration produces:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.4950     0.15      0.07     0.20      0     1     0     0     0
0.01     0.0012     0.61      0.44     0.40      0     0     0     0     0
0.45*    0.0038     0.10      0.21     0.15      1*    0     0     0     0

Likewise, the third iteration produces:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.4950     0.15      0.07     0.20      0     1     0     0     0
0.01     0.0012     0.305*    0.44     0.40      0     0     1*    0     0
0.45     0.0038     0.10      0.21     0.15      1     0     0     0     0

Consider that in the fourth iteration the division of the original probability π_(2,1) is now by 3, and no longer by 2:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.3300*    0.15      0.07     0.20      0     2*    0     0     0
0.01     0.0012     0.305     0.44     0.40      0     0     1     0     0
0.45     0.0038     0.10      0.21     0.15      1     0     0     0     0

Doubtless by now, the mathematical expression of this process, which is given above, will be better understood. The process continues:

Fifth iteration:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.3300     0.15      0.07     0.20      0     2     0     0     0
0.01     0.0012     0.305     0.44     0.40      0     0     1     0     0
0.30*    0.0038     0.10      0.21     0.15      2*    0     0     0     0

Sixth iteration:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.3300     0.15      0.07     0.20      0     2     0     0     0
0.01     0.0012     0.305     0.22*    0.40      0     0     1     1*    0
0.30     0.0038     0.10      0.21     0.15      2     0     0     0     0

Seventh iteration:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.3300     0.15      0.07     0.20      0     2     0     0     0
0.01     0.0012     0.305     0.22     0.20*     0     0     1     1     1*
0.30     0.0038     0.10      0.21     0.15      2     0     0     0     0

Eighth iteration:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.2475*    0.15      0.07     0.20      0     3*    0     0     0
0.01     0.0012     0.305     0.22     0.20      0     0     1     1     1
0.30     0.0038     0.10      0.21     0.15      2     0     0     0     0

Ninth iteration:

                    Π*                                          M
0.01     0.0025     0.05      0.14     0.20      0     0     0     0     0
0.06     0.0025     0.09      0.14     0.05      0     0     0     0     0
0.02     0.2475     0.15      0.07     0.20      0     3     0     0     0
0.01     0.0012     0.2033*   0.22     0.20      0     0     2*    1     1
0.30     0.0038     0.10      0.21     0.15      2     0     0     0     0

Eventually, the iteration count s having decreased to zero, the iterative process terminates. The final multiplicity matrix M, given immediately above, can alternatively be expressed as:

point (x, y)      (1, 2)    (0, 4)    (2, 3)    (3, 3)    (4, 3)
multiplicity        3         2         2         1         1
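For illustration, applying the hypothetical choose_interpolation_points sketch given earlier in this summary (with s=9 interpolation points) to the reliability matrix Π of this example reproduces exactly the final multiplicity matrix and the point/multiplicity table above:

```python
# Usage of the illustrative sketch defined earlier (assumes that
# choose_interpolation_points from the Summary has already been defined).
Pi = [[0.01, 0.0025, 0.05, 0.14, 0.20],
      [0.06, 0.0025, 0.09, 0.14, 0.05],
      [0.02, 0.9900, 0.15, 0.07, 0.20],
      [0.01, 0.0012, 0.61, 0.44, 0.40],
      [0.90, 0.0038, 0.10, 0.21, 0.15]]
print(choose_interpolation_points(Pi, s=9))
# [[0 0 0 0 0]
#  [0 0 0 0 0]
#  [0 3 0 0 0]
#  [0 0 2 1 1]
#  [2 0 0 0 0]]
```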

Importantly to understanding the scope of the present invention, from this point forward, certain mathematical processes employed are substantially conventional. Namely, using conventional interpolation methods, the final multiplicity matrix M is used to find a non-trivial polynomial Q_(M)(X,Y) of lowest (1,k−1)-weighted degree such that Q_(M)(X,Y) has a zero of multiplicity at least m_(i,j) at the point (x_(j),α_(i)).

For the derived multiplicity matrix M above, this polynomial Q_(M)(X,Y) turns out to be:

Q_(M)(X,Y) = 1 + X + Y − X² − Y² − 2X²Y + XY² − Y³ + X⁴ − 2X³Y − X²Y² + 2XY³.

Continuing, this polynomial is factorized into factors of type Y−ƒ(X), where deg(ƒ)<k. This factoring, while non-trivial, is mathematically well understood and is, for example, taught in the prior art reference of Guruswami and Sudan, op. cit.

In accordance with the present invention, these factors correspond to potential codewords. Given these potential codewords, a single new codeword—suitable to reconstitute from the received reliability matrix Π the data that was originally transmitted upon the communications channel—is selected. Continuing with the example, the derived polynomial Q_(M)(X,Y) factors as:

Q_(M)(X,Y) = (Y−X−1)(Y+1+2X)(Y+1+3XY+3X+3X²).

We identify the two solutions (Y−X−1) and (Y+1+2X) as corresponding to tentative codewords. Of these two solutions, the solution (Y−X−1), or, equivalently, Y=X+1, corresponds to the sent codeword. It may now be recalled that y=x+1 was indeed the relationship between the positions x_(i) and the values y_(i) of the sent codeword c.
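A short check, offered for illustration only, confirms that the factor Y−X−1 is consistent with the polynomial Q_(M)(X,Y) given above: substituting Y=ƒ(X)=X+1 makes Q_(M) vanish at every x in 𝔽₅, and evaluating ƒ at the five positions reconstructs the sent codeword (1,2,3,4,0). The coefficient dictionary below is simply a transcription of Q_(M) from the text.

```python
# Sketch: verify the factor Y - X - 1 against Q_M(X, Y) as given above (mod 5).
q = 5
QM = {(0, 0): 1, (1, 0): 1, (0, 1): 1, (2, 0): -1, (0, 2): -1, (2, 1): -2,
      (1, 2): 1, (0, 3): -1, (4, 0): 1, (3, 1): -2, (2, 2): -1, (1, 3): 2}

def eval_QM(x, y):
    # evaluate Q_M at (x, y) over F_5; keys (i, j) stand for the monomial X^i Y^j
    return sum(c * pow(x, i, q) * pow(y, j, q) for (i, j), c in QM.items()) % q

f = lambda x: (x + 1) % q                      # the factor Y - X - 1, i.e. f(X) = X + 1
assert all(eval_QM(x, f(x)) == 0 for x in range(q))
print([f(x) for x in range(q)])                # -> [1, 2, 3, 4, 0], the sent codeword
```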

The mathematical process of the present invention has thus sufficed to precisely recover the Reed-Solomon codeword transmitted upon a channel. This has efficiently transpired, in polynomial time, notwithstanding that the channel is so noisy that less than half, or only forty percent (40%), of the codeword was received correctly.

3. Features of the Present Invention

As a more restrictive characterization of the present invention, the invention will be recognized to be embodied in a method of soft-decision decoding a forward error-correction code of the Reed-Solomon, BCH, or algebraic-geometry types, wherein the method is both (i) algebraic, and (ii) transpiring in polynomial time.

The decoding method is characterized in that the magnitude of a coding gain provided by the Reed-Solomon code can be traded-off for computational complexity; relatively more coding gain being realized at relatively more computational complexity while relatively less coding gain is realized at relatively less computational complexity. In this method, the computational complexity is always bounded by a polynomial function of the length of a codeword of the Reed-Solomon code.

If this method of decoding a Reed-Solomon code (or a BCH subfield subcode of the Reed-Solomon code) is applied on a q-ary symmetric communication channel then, as the number of interpolation points becomes very large, it will produce a list of candidate codewords containing the particular codeword at the input to the q-ary symmetric communications channel, provided that less than t errors have occurred, further provided that τ=t/n satisfies the inequality ${\left( {1 - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} > R$

where q is the number of symbols of the symmetric communication channel, and R is the rate of the Reed-Solomon code.
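As a purely numerical illustration (the parameters q=256 and R=1/2 below are chosen arbitrarily and are not taken from the specification), the largest error fraction τ satisfying this condition can be found by a simple search; it is roughly 0.29 for these parameters, compared with (1−R)/2=0.25 for conventional decoding up to half the minimum distance.

```python
# Numerical illustration (sketch): largest tau with (1 - tau)^2 + tau^2/(q-1) > R.
# The parameters q = 256, R = 0.5 are chosen for illustration only.
def max_correctable_fraction(q, R, step=1e-6):
    tau = 0.0
    while (1.0 - tau) ** 2 + tau ** 2 / (q - 1) > R:
        tau += step
    return tau

print(max_correctable_fraction(256, 0.5))   # approximately 0.293
```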

Similarly, if this method of decoding of a Reed-Solomon code (or a BCH subfield subcode of the Reed-Solomon code) is applied on a q-ary symmetric erasure communication channel then, as the number of interpolation points becomes very large, it will produce a list of candidate codewords containing the particular codeword at the input to the q-ary symmetric erasure communication channel, provided that less than t errors have occurred, further provided that the fraction τ=t/n satisfies the inequality: ${{\frac{1}{1 - \sigma}\left( {\left( {1 - \tau - \sigma} \right)^{2} + \frac{\tau^{2}}{q - 1}} \right)} + \frac{\sigma}{q}} \geq R$

where q is the number of symbols of the q-ary symmetric erasure communication channel, not counting the symbol φ denoting erasure, σ is the fraction of positions in which erasures occurred, and R is the rate of the Reed-Solomon code.

Similarly, if this method of decoding a Reed-Solomon code (or a BCH subfield subcode of the Reed-Solomon code) is applied on a q-PSK communication channel then, as the number of interpolation points becomes very large, it will produce a list of candidate codewords containing the particular codeword at the input to the q-PSK channel, provided that less than s errors to the nearest neighbors and less than t errors beyond the nearest neighbors have occurred, further provided that the fractions σ=s/n and τ=t/n satisfy the inequality: ${\left( {1 - \tau - \sigma} \right)^{2} + \frac{\tau^{2}}{q - 3} + \frac{\sigma^{2}}{2}} \geq R$

where q is the number of symbols of the q-PSK communication channel and R is the rate of the Reed-Solomon code.

4. Aspects and Embodiments of the Present Invention

Accordingly, in one of its aspects, the present invention may be considered to be embodied in a method of decoding an error-correcting code. The error-correcting code has a number of codewords, with each codeword having multiple symbols. The error-correcting code is of the Reed-Solomon type and/or of the Bose-Chaudhuri-Hocquenghem (BCH) type.

The decoding method constitutes algebraic soft-decision decoding by act of converting reliability information concerning the relative likelihoods of the different code symbols into algebraic interpolation conditions. Ergo, by its reliance on the “soft” reliability information, the method is called a soft-decision decoding method.

In accordance with the present invention, data inputs to the algebraic soft-decision decoding method commonly include reliability information concerning the relative likelihoods of different code symbols and/or data from which the reliability information concerning the relative likelihoods of different code symbols can be computed. With these inputs, the algebraic soft-decision decoding consists essentially of three steps.

First, given the reliability information, computing from this reliability information a set of algebraic interpolation conditions.

Next, given the computed algebraic interpolation conditions, interpolating to find a non-trivial interpolation polynomial that satisfies these conditions.

Then, given the interpolation polynomial, factoring the polynomial to find factors that correspond to codewords of the Reed-Solomon code or the BCH code, therein generating a list of candidate codewords.

In more formal, mathematical terms, the algebraic soft-decision decoding method may be considered to be applied to an (n,k,d) Reed-Solomon code over a finite field 𝔽_(q) with q elements, or, equivalently (for the operation of the present invention) and alternatively, an (n,k′,d′) BCH code over a finite field 𝔽_(q′) with q′ elements. In these expressions n≧1 is the length, k≦n is the dimension, and d=n−k+1 is the minimum distance of the Reed-Solomon code, while n is the length, k′≦k is the dimension, and d′≧d is the minimum distance of the BCH code, the BCH code being but a subfield subcode of the Reed-Solomon code. For each of the n code symbols of the Reed-Solomon code, the input to the method of the present invention includes reliability information concerning the relative likelihood of each code symbol being equal to α∈𝔽_(q) for each element α of the finite field 𝔽_(q), and/or data from which such reliability information can be computed.

With these inputs, the three steps of the algebraic soft-decision decoding method are as follows:

First, given the reliability information, non-negative integer-valued numbers m_(i,j) are computed for each position j in the code, where 1≦j≦n, and for each element α_(i)∈𝔽_(q), where 1≦i≦q, so that the total number of non-zeros among the non-negative integer-valued numbers m_(i,j) depends on the reliability information.

Second, given these numbers m_(i,j), a (non-trivial) polynomial Q_(M)(X,Y) is found by interpolating, so that Q_(M)(X,Y) has a zero of multiplicity at least m_(i,j) at the point (x_(j),α_(i)) for each j in the range 1≦j≦n and each i in the range 1≦i≦q, this polynomial Q_(M)(X,Y) best having the least possible (1,k−1)-weighted degree. (By “best having” it is meant only that the method works faster to produce better results when the polynomial Q_(M)(X,Y) so has the least possible (1,k−1)-weighted degree; the method will still work even if the polynomial Q_(M)(X,Y) does not so have the least possible (1,k−1)-weighted degree.)

Third, given the polynomial Q_(M)(X,Y), this polynomial is factored to find factors of type Y−ƒ(X), where the degree of the polynomial ƒ(X) is less than k, each factor corresponding to a codeword of the Reed-Solomon code or the BCH code, thereby generating a list of candidate codewords.

In the preferred methods, each of the computing and the interpolating and the factoring have a computational complexity that is always bounded by a polynomial function of the length n of the Reed-Solomon code and/or the BCH code. Accordingly, the entire algebraic soft-decision decoding has a computational complexity that is always bounded by a polynomial function of the length n of the Reed-Solomon code and/or the BCH code.

Any of the computing, the interpolating, and the factoring may be implemented in special purpose digital logic, including digital signal processing (DSP), circuits, and/or by software in a general purpose digital computer.

The method of the present invention is most often used in concatenated coding schemes, multilevel coding schemes, and combinations thereof. It may be, for example, used in a communication system, in a recording system, or in multitudinous other systems in which error-correcting codes are employed.

In another of its aspects, the present invention may be considered to be embodied in a method of decoding a Reed-Solomon error-correcting code of length n and rate R over a finite field 𝔽_(q) with q elements, or, equivalently, a Bose-Chaudhuri-Hocquenghem (BCH) error-correcting code of length n and rate R′ over the finite field 𝔽_(q), on a channel of type drawn from the group consisting of q-ary symmetric channels, q-ary symmetric erasure channels, and q-PSK channels.

In such an environment, the input to the method comprises data that includes a vector y=(y₁,y₂, . . . , y_(n)) observed at the channel output, where all the n elements of the vector y belong to the finite field 𝔽_(q) in the case of q-ary symmetric channels and q-PSK channels, but, in the case of q-ary symmetric erasure channels, all the n elements of the vector y belong to 𝔽_(q)∪{φ}, the special symbol φ denoting erasure.

The method essentially consists of three steps.

First, given the vector y=(y₁,y₂, . . . , y_(n)), from this vector y a set of algebraic interpolation conditions is computed.

Second, the computed set of algebraic interpolation conditions is interpolated to find a (non-trivial) interpolation polynomial that satisfies the algebraic interpolation conditions.

Third, given the interpolation polynomial, this polynomial is factored to find factors that correspond to codewords of the Reed-Solomon code or the BCH code. Thus a list of candidate codewords is generated.

Each of the computing, the interpolating and the factoring has computational complexity that is always bounded by a polynomial function of the length of the Reed-Solomon code and/or the BCH code.

The present invention may be thus characterized by its application to Reed-Solomon error-correcting codes, to BCH error-correcting codes, and to algebraic-geometric codes in general. Notably, the present invention can be also applied to the output of a hard-decision channel. Since no probability accompanies the output of a hard-decision channel, the substantial power of the soft-decision decoding of the present invention is wasted, but the present invention will still operate to decode RS, BCH, and algebraic-geometric error-correcting codes, while outperforming all the existing hard-decision decoding methods.

The present invention is thus better defined by what it does than exactly (i) what it operates on, as inputs, and, indeed, (ii) what it produces, as output. In illustration of (ii), the output can be considered to be the list of candidate codewords, or, by selection in a fourth step, some subset of these codewords (which may include a subset of one, meaning one codeword). In all cases, the essence of the present invention is that it is converting reliability information concerning the relative likelihoods of the different code symbols into algebraic interpolation conditions, and then proceeding to produce from the interpolation conditions a polynomial which, when factored, will deliver up factors that correspond to codewords. Ergo, the soft-decision decoding of the present invention uses reliability information—commonly heretofore discarded—to decode Reed-Solomon, BCH, and algebraic-geometric codes.

5. Applications and Implementations of the Present Invention

In still yet another of its aspects, the present invention may be used in a concatenated coding scheme, wherein a maximum a posteriori likelihood (MAP) algorithm, such as the BCJR decoding algorithm (see L. R. Bahl, J. Cocke, F. Jelinek, J. Raviv, Optimal decoding of linear codes for minimizing symbol error rate, IEEE Trans. Inform. Theory, vol. 20, pp. 284-287, 1974), is preferably employed to decode the inner code(s). The present invention may be also used in a multilevel coding scheme, or in a coding scheme that combines multilevel coding with concatenated coding.

The method of the present invention may be implemented in either software or hardware or any combination thereof. In particular, any one of the computation (determination of interpolation points), interpolation, and factorization steps of the decoding method may be implemented in either special purpose digital logic or by software in a general purpose digital computer.

The present invention may be used in a communication system, such as a cable-modem, an ISDN line, a satellite telecommunications link, a wireless communications (multi-user) system, and many others. The present invention may be also used in a recording system, such as a magnetic disk (read and write channels), a magnetic tape, a compact disk, a DVD, as well as other magnetic, optical, and holographic storage media.

The following illustrates a possible implementation of the present invention in a communication system 20. The communication system has a source of data 22, a Reed-Solomon encoder 23 of the data producing a Reed-Solomon codeword, and a decoder 24, as illustrated in FIG. 1.

The decoder includes (1) a re-encoder 26 receiving the reliability matrix Π and producing an estimated received Reed-Solomon codeword, the re-encoder supplying this Reed-Solomon codeword to each of (2) an interpolator 28, also receiving chosen interpolation points from an interpolation point chooser 30 to produce a function Q_(M)(X,Y), (3) a factorizer 32 also receiving the function Q_(M)(X,Y) from the interpolator to produce a list of candidate Reed-Solomon codewords, (4) an interpolation point chooser operating with the reliability matrix Π, and (5) a selector 34, also receiving the candidate codewords from the factorizer, for selecting a subset of codewords as does best permit the reconstituting of the sent data from the received matrix Π.

The (4) interpolation point chooser records points in the multiplicity matrix M by process of—as was stated before—(1) commencing with an initial reliability matrix Π*=Π as received, and with an empty all-zero matrix M into which the points will be recorded, and with an integer s representing the number of interpolation points, (2) finding a position (i,j) of the largest element in Π*, (3) deriving π*_(i,j)←π_(i,j)/(m_(i,j)+2); m_(i,j)←m_(i,j)+1; and s←s−1, and (4) deciding if s=0, and if so then outputting M, else returning to performing the (2) finding and the (3) deriving and the (4) deciding.

These and other aspects and attributes of the present invention will become increasingly clear upon reference to the following drawings and accompanying specification.

BRIEF DESCRIPTION OF THE DRAWINGS

Referring particularly to the drawings for the purpose of illustration only and not to limit the scope of the invention in any way, these illustrations follow:

FIG. 1 is a diagrammatic view of a system 20 performing algebraic soft-decision decoding of Reed-Solomon codes in accordance with the present invention.

FIG. 2 is a graph showing a comparison for a particular simple concatenated coding scheme, where an RS(256,144,113) code over 𝔽₂₅₆ is concatenated with a binary parity check code of length 9 so that the coding scheme corresponds to a binary code with rate 1/2, the channel being an additive white Gaussian noise channel while the inner parity check code is in all cases decoded with a maximum a posteriori likelihood decoding algorithm: the different curves 40-43 correspond to the performance achieved by two hard-decision decoding algorithms (decoding up to half the minimum distance 40 and the algorithm by Guruswami and Sudan 41), and two soft-decoding algorithms (Forney's GMD decoding 42 and the algebraic soft-decision decoding algorithm 43 of the present invention taught within this specification).

FIG. 3 is a graph showing the performance of binary BCH codes of length 63, on a BPSK-modulated additive white Gaussian noise (AWGN) channel, under the algebraic soft-decision decoding method of the present invention. The plurality of curves 50-60 show the probability that the transmitted codeword is not on the list produced by (the factorization step in) the algebraic soft-decision decoder, for a large number of interpolation points (s→∞). Said probability is plotted versus the SNR and versus the dimension of the corresponding BCH codes.

FIG. 4 is another graph, similar to the graph of FIG. 3, showing the performance of binary BCH codes of length 127, on a BPSK-modulated AWGN channel, under the algebraic soft-decision decoding method of the present invention. The plurality of curves 70-85 show the probability that the transmitted codeword is not on the list produced by (the factorization step in) the algebraic soft-decision decoder, for a large number of interpolation points (s→∞). Said probability is plotted versus the SNR and versus the dimension of the corresponding BCH codes.

FIG. 5 is a graph showing the performance of the (63,36,11) binary BCH code on a BPSK-modulated AWGN channel, under different decoding methods. The six different curves correspond to the performance achieved by six different decoding methods: classical Berlekamp-Welch hard-decision decoding up to half the minimum distance (see L. R. Welch, E. R. Berlekamp, Error Correction for Algebraic Block Codes, U.S. Pat. No. 4,633,470, issued Dec. 30, 1986) 90, hard-decision list decoding of Guruswami-Sudan (see V. Guruswami, M. Sudan, op. cit.) 91, GMD soft-decision decoding (see G. D. Forney, Jr., Generalized minimum distance decoding, IEEE Trans. Inform. Theory, vol. IT-12, pp. 125-131, 1966) 92, Chase-3 soft-decision decoding algorithm (see D. Chase, A class of algorithms for decoding block codes with channel measurement information, IEEE Trans. Inform. Theory, vol. 18, pp. 170-182, 1972) 93, algebraic soft-decision decoding of the present invention 94, and maximum-likelihood soft-decision decoding 95.

FIG. 6 is another graph, similar to the graph of FIG. 5, showing the performance of the (127,71,19) binary BCH code on a BPSK-modulated AWGN channel, under different decoding methods. The six different curves correspond to the performance achieved by six different decoding methods: classical Berlekamp-Welch hard-decision decoding up to half the minimum distance (see L. R. Welch, E. R. Berlekamp, op. cit.) 100, hard-decision list decoding of Guruswami-Sudan (see V. Guruswami, M. Sudan, op. cit.) 101, GMD soft-decision decoding (see G. D. Forney, Jr., op. cit.) 102, Chase-3 soft-decision decoding algorithm (see D. Chase, op. cit.) 103, algebraic soft-decision decoding of the present invention 104, and maximum-likelihood soft-decision decoding 105.

FIG. 7 is a graph showing the fraction t/n of correctable errors using binary BCH codes on a binary symmetric channel, plotted as a function of the rate R of the underlying Reed-Solomon code, for three different algorithms: classical Berlekamp-Welch hard-decision decoding up to half the minimum distance (see L. R. Welch and E. R. Berlekamp, op. cit.) 110, hard-decision list decoding of Guruswami-Sudan (see V. Guruswami, M. Sudan, op. cit.) 111, and the algebraic soft-decision decoding of the present invention 112.

FIG. 8 is a graph showing the number t of correctable errors using binary BCH codes of length 63 on a binary symmetric channel, plotted as a function of the dimension of said BCH codes, for three different algorithms: classical Berlekamp-Welch hard-decision decoding up to half the minimum distance (see L. R. Welch, E. R. Berlekamp, op. cit.) 120, hard-decision list decoding of Guruswami-Sudan (see V. Guruswami, M. Sudan, op. cit.) 121, and the algebraic soft-decision decoding of the present invention 122. Also plotted in FIG. 8, for comparison, is the well-known sphere-packing bound on t 123.

FIG. 9 is another graph, similar to the graph of FIG. 8, showing the number t of correctable errors using binary BCH codes of length 127 on a binary symmetric channel, plotted as a function of the dimension of said BCH codes, for three different algorithms: classical Berlekamp-Welch hard-decision decoding up to half the minimum distance (see L. R. Welch, E. R. Berlekamp, op. cit.) 130, hard-decision list decoding of Guruswami-Sudan (see V. Guruswami, M. Sudan, op. cit.) 131, and the algebraic soft-decision decoding of the present invention 132. Also plotted in FIG. 9, for comparison, is the well-known sphere-packing bound on t 133.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Although specific embodiments of the invention will now be described, it should be understood that such embodiments are by way of example only and are merely illustrative of but a small number of the many possible specific embodiments to which the principles of the invention may be applied. Various changes and modifications obvious to one skilled in the art to which the invention pertains are deemed to be within the spirit, scope, and contemplation of the invention as further defined in the appended claims.

1. Preliminaries

Let 𝔽_(q) be the finite field with q elements. The ring of polynomials over 𝔽_(q) in a variable X is denoted 𝔽_(q)[X]. Reed-Solomon codes are obtained by evaluating subspaces of 𝔽_(q)[X] in a set of points D={x₁, x₂, . . . , x_(n)} which is a subset of 𝔽_(q). Specifically, the RS code ℂ_(q)(n,k) of length n≦q, where n≧1, and dimension k≦n, where k≧1, is defined as follows: $$\mathbb{C}_q(n,k) \overset{def}{=} \left\{ \left(f(x_1),\ldots,f(x_n)\right) : x_1,\ldots,x_n \in D,\; f(X) \in \mathbb{F}_q[X],\; \deg f(X) < k \right\} \qquad (1)$$

The point set D is usually taken as 𝔽_(q) or as 𝔽*_(q), the set of all nonzero elements of 𝔽_(q). Unless stated otherwise, we shall henceforth assume that D=𝔽*_(q), so that n=q−1. The set of polynomials of degree less than k in 𝔽_(q)[X] is a linear space, which together with the linearity of the evaluation map (1) establishes that ℂ_(q)(n,k) is a linear code. The minimum Hamming distance of ℂ_(q)(n,k) is d=n−k+1, which follows from the fact that any non-zero polynomial of degree less than k evaluates to zero in less than k positions.
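By way of illustration, the evaluation map (1) can be exercised with a few lines of Python; the prime field 𝔽₅, the point set, and the function names below are illustrative assumptions only (they match Example 1 of Section 2 below).

```python
# Minimal sketch of Reed-Solomon encoding by polynomial evaluation, as in (1).
# Assumes a prime field GF(q), so field arithmetic is plain integer arithmetic modulo q.

q = 5                      # field size (illustrative; matches Example 1 below)
D = [0, 1, 2, 3, 4]        # evaluation point set D, here all of GF(5)

def poly_eval(coeffs, x, q):
    """Evaluate f(X) = coeffs[0] + coeffs[1]*X + ... at the point x, modulo q."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % q
    return result

def rs_encode(message, D, q):
    """Map the message polynomial (f_0, ..., f_{k-1}) to the codeword (f(x_1), ..., f(x_n))."""
    return [poly_eval(message, x, q) for x in D]

# f(X) = 1 + X yields the codeword (1, 2, 3, 4, 0) used in Example 1 below.
print(rs_encode([1, 1], D, q))       # -> [1, 2, 3, 4, 0]
```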

Given an arbitrary vector y∈_(q) ^(n), the hard-decision decoding task consists of finding the codeword c∈_(q)(n,k) such that the Hamming weight wt(e) of the error vector e=y−c is minimized. The Berlekamp-Welch algorithm (L. R. Welch, E. R. Berlekamp, Error correction for algebraic block codes, U.S. Pat. No. 4,633,470) is a well-known algorithm that accomplishes this task provided wt(e)<d/2. Generalizing upon Berlekamp-Welch, M. Sudan (Decoding of Reed-Solomon codes beyond the error correction bound, J. Complexity, vol. 12, pp. 180-193, 1997, op cit.) and V. Guruswami, M. Sudan (Improved decoding of Reed-Solomon and algebraic-geometric codes, IEEE Trans. Inform. Theory, vol. 45, pp. 1755-1764, 1999, op. cit.) derived a polynomial-time algorithm that achieves error correction substantially beyond half the minimum distance of the code. In the remainder of this section we describe the essential elements of this algorithm.

Definition 1. Let $A(X,Y) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} a_{i,j}X^{i}Y^{j}$

be a bivariate polynomial over 𝔽_(q) and let w_(X), w_(Y) be non-negative real numbers. The (w_(X),w_(Y))-weighted degree of A(X,Y) is defined as the maximum over all numbers iw_(X)+jw_(Y) such that a_(i,j)≠0.

The (1,1)-weighted degree is simply the degree of a bivariate polynomial. The number of monomials of (w_(X),w_(Y))-weighted degree at most δ is denoted N_(w_(X),w_(Y))(δ). Thus $$N_{w_X,w_Y}(\delta) \overset{def}{=} \left|\left\{ X^{i}Y^{j} : i,j \geq 0 \text{ and } iw_X + jw_Y \leq \delta \right\}\right|$$

The following lemma provides a closed-form expression for N_(w_(X),w_(Y))(δ) for the case w_(X)=1. Similar statements can be found in V. Guruswami, M. Sudan, op. cit.; R. R. Nielsen, T. Høholdt, Decoding Reed-Solomon codes beyond the minimum distance, preprint, 1998; R. M. Roth, G. Ruckenstein, Efficient decoding of Reed-Solomon codes beyond half the minimum distance, IEEE Trans. Inform. Theory, vol. 46, pp. 246-258, January 2000; and M. Sudan, op. cit. Let [x] denote the largest integer not greater than x.

Lemma 1. $$N_{1,k}(\delta) = \left(\left\lbrack \frac{\delta}{k} \right\rbrack + 1\right)\left(\delta + 1 - \frac{k}{2}\left\lbrack \frac{\delta}{k} \right\rbrack\right)$$

The lemma follows by straightforward counting of monomials; for a detailed proof, see V. Guruswami, M. Sudan, op. cit. The exact, but elaborate, expression in Lemma 1 can be converted into a simple lower bound as follows: $$N_{1,k-1}(\delta) = \frac{(\delta+1)^2}{2(k-1)} + \frac{k-1}{2}\left(\left\lceil \frac{\delta+1}{k-1} \right\rceil - \left(\left\lceil \frac{\delta+1}{k-1} \right\rceil - \frac{\delta+1}{k-1}\right)^{2}\right) > \frac{\delta^2}{2k} \qquad (2)$$
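The closed form of Lemma 1 and a lower bound of the same form as in (2) are easy to check numerically. The following Python sketch (an illustrative check, not part of the decoding method; the function names are ours) counts the monomials directly and compares the count against both expressions.

```python
# Brute-force count of monomials X^i Y^j with i*wX + j*wY <= delta, checked against the
# closed form of Lemma 1 and a simple lower bound of the form delta^2 / (2k).

def count_monomials(wX, wY, delta):
    """Directly count exponent pairs (i, j) with i*wX + j*wY <= delta."""
    count, j = 0, 0
    while j * wY <= delta:
        count += (delta - j * wY) // wX + 1
        j += 1
    return count

def lemma1(k, delta):
    """Closed form for N_{1,k}(delta): (floor(delta/k) + 1) * (delta + 1 - (k/2)*floor(delta/k))."""
    J = delta // k
    return (J + 1) * (delta + 1) - k * J * (J + 1) // 2

for k in (2, 3, 5):
    for delta in range(0, 40):
        assert count_monomials(1, k, delta) == lemma1(k, delta)
        assert count_monomials(1, k, delta) > delta * delta / (2 * k)
print("Lemma 1 and the lower bound verified for the sampled parameters")
```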

Given the vector y=c−e=(y₁,y₂, . . . , y_(n)) and the corresponding point set D={x₁,x₂, . . . , x_(n)}, we consider the set of pairs P={(x₁,y₁), (x₂,y₂), . . . , (x_(n),y_(n))} as points in the two-dimensional affine space. Given a point (α,β) and a bivariate polynomial A(X,Y) in 𝔽_(q)[X,Y], we say that (α,β) lies on A(X,Y) if A(α,β)=0. Equivalently, we say that A(X,Y) passes through the point (α,β).

In the error-free case, we have e=0 and therefore y_(i)=c_(i)=ƒ(x_(i)) for a unique polynomial ƒ(X)∈𝔽_(q)[X] of degree less than k. It follows that all points in P lie on a polynomial of type Y−ƒ(X), where deg ƒ(X)<k. Furthermore, this polynomial is unique, and can be found by interpolation techniques. If some of the positions of the error vector e are nonzero, then no polynomial of type Y−ƒ(X) with deg ƒ(X)<k passes through all the points in P, unless e itself is a codeword. In this case, we let Λ(X,Y) be an error-locator polynomial in 𝔽_(q)[X,Y], defined by the property that Λ(x_(i),y_(i))=0 whenever e_(i)≠0. The error-locator polynomial annihilates the effect of the nonzero error vector. In other words, one can always find a polynomial of type Λ(X,Y)(Y−ƒ(X)) with deg ƒ(X)<k that passes through all the points of P. This leads to the following interpolation problem: find a bivariate polynomial A(X,Y) of type

A(X,Y)=Λ(X,Y)(Y−ƒ(X)) with deg ƒ(X)<k  (3)

such that A(X,Y) passes through all (x_(i),y_(i))∈P and Y−ƒ(X) passes through a maximal number of points in P. Provided wt(e)<d/2, a solution to (3) solves the decoding problem, since then the polynomial ƒ(X) corresponds to the transmitted codeword c=y−e.

The algorithm of V. Guruswami, M. Sudan, op. cit., results by relaxing the classical interpolation problem (3) and, at the same time, constraining it further. We relax the requirement that A(X,Y) should be of the type Λ(X,Y)(Y−ƒ(X)). On the other hand, we will be interested in bivariate polynomials that not only pass through all the points in P but do so with high multiplicity.

Definition 2. A bivariate polynomial A(X,Y) is said to pass through a point (α,β) with multiplicity m if the shifted polynomial A(X+α, Y+β) contains a monomial of degree m and does not contain a monomial of degree less than m. Equivalently, the point (α,β) is said to be a zero of multiplicity m of the polynomial A(X,Y).

Using a well-known explicit relation between the coefficients of a bivariate polynomial A(X,Y) and the coefficients of the shifted polynomial, we find that Definition 2 imposes the following linear constraints $$\sum_{i \geq r}\;\sum_{j \geq l} \binom{i}{r}\binom{j}{l}\,\alpha^{i-r}\beta^{j-l}\,a_{i,j} = 0 \quad \text{for all } r,l \geq 0 \text{ such that } r+l < m \qquad (4)$$

on the coefficients a_(i,j) of A(X,Y). Thus A(X,Y) passes through a given point with multiplicity at least m if and only if its coefficients a_(i,j) satisfy the m(m+1)/2 constraints specified by (4). We are now ready to formulate the first step of the Sudan algorithm.

Interpolation step: Given the set P of points in 𝔽_(q)×𝔽_(q) and a positive integer m, compute the nontrivial bivariate polynomial Q_(P)(X,Y) of minimal (1,k-1)-weighted degree that passes through all the points in P with multiplicity at least m.

The number of linear constraints imposed on the coefficients of Q_(P)(X,Y) by the interpolation step is nm(m+1)/2. If the (1,k−1)-weighted degree of Q_(P)(X,Y) is δ, then Q_(P)(X,Y) may have up to N_(1,k−1)(δ) nonzero coefficients. These coefficients should be chosen so as to satisfy the nm(m+1)/2 linear constraints of the type (4). This produces a system of nm(m+1)/2 linear equations (not all of them necessarily linearly independent) over 𝔽_(q) in N_(1,k−1)(δ) unknowns. It is clear that this system has a solution as long as $$N_{1,k-1}(\delta) > \frac{n\,m(m+1)}{2} \qquad (5)$$

For an efficient algorithm to solve such a system of equations and, hence, accomplish the interpolation step of the Sudan algorithm, see R. Kötter, On Algebraic Decoding of Algebraic-Geometric and Cyclic Codes, p. 88, Ph.D. Thesis, Prentice-Hall, 1991; R. R. Nielsen, T. Høholdt, Decoding Reed-Solomon codes beyond half the minimum distance, preprint, 1998 (see also R. M. Roth, G. Ruckenstein, Efficient decoding of Reed-Solomon codes beyond half the minimum distance, IEEE Trans. Inform. Theory, vol. 46, pp. 246-258, January 2000).
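In concrete terms, the interpolation step is a homogeneous linear system built from the constraints (4). The following Python sketch solves it over a prime field by plain Gaussian elimination; it is an illustrative toy under stated assumptions (prime field, brute-force search over the weighted degree), not the efficient procedures of the references just cited, and all function names are ours. The usage line anticipates the interpolation points (12) of Example 1 in Section 2.

```python
# Toy interpolation over GF(p): build the linear constraints (4) for each interpolation
# point and find a nonzero solution by Gaussian elimination.  Small examples only.

from math import comb

def monomials(k, delta):
    """Exponent pairs (i, j) with (1, k-1)-weighted degree i + j*(k-1) <= delta."""
    return [(i, j) for j in range(delta // (k - 1) + 1)
                   for i in range(delta - j * (k - 1) + 1)]

def constraint_rows(point, mult, monos, p):
    """One row of (4) per pair (r, l) with r + l < mult, for the point (alpha, beta)."""
    alpha, beta = point
    rows = []
    for r in range(mult):
        for l in range(mult - r):
            rows.append([comb(i, r) * comb(j, l) * pow(alpha, i - r, p) * pow(beta, j - l, p) % p
                         if i >= r and j >= l else 0 for (i, j) in monos])
    return rows

def nullspace_vector(rows, ncols, p):
    """Return a nonzero vector in the nullspace of the given matrix over GF(p), if any."""
    A = [row[:] for row in rows]
    pivots, r = {}, 0                              # pivots: column -> row index
    for c in range(ncols):
        pr = next((i for i in range(r, len(A)) if A[i][c]), None)
        if pr is None:
            continue
        A[r], A[pr] = A[pr], A[r]
        inv = pow(A[r][c], p - 2, p)
        A[r] = [x * inv % p for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c]:
                f = A[i][c]
                A[i] = [(x - f * y) % p for x, y in zip(A[i], A[r])]
        pivots[c], r = r, r + 1
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    v = [0] * ncols
    v[free[0]] = 1
    for c, row in pivots.items():
        v[c] = -A[row][free[0]] % p
    return v

def interpolate(points_with_mults, k, p):
    """Grow the weighted degree until (5) guarantees a nontrivial solution, then solve."""
    cost = sum(m * (m + 1) // 2 for _, m in points_with_mults)
    delta = 0
    while True:
        monos = monomials(k, delta)
        if len(monos) > cost:                      # enough unknowns, cf. (5)
            rows = [row for pt, m in points_with_mults
                        for row in constraint_rows(pt, m, monos, p)]
            v = nullspace_vector(rows, len(monos), p)
            if v is not None:
                return {mono: c for mono, c in zip(monos, v) if c}
        delta += 1

# Points and multiplicities of (12), over GF(5) with k = 2 (Example 1, Section 2):
Q = interpolate([((1, 2), 3), ((0, 4), 2), ((2, 3), 2), ((3, 3), 1), ((4, 3), 1)], k=2, p=5)
print(Q)        # nonzero coefficients a_ij of one valid interpolation polynomial, keyed by (i, j)
```

Note that the search above stops at the first weighted degree for which (5) guarantees a solution; as remarked later in this specification, a polynomial of larger than minimal weighted degree still works, only less efficiently.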

The idea of Sudan's algorithm is that, under certain constraints on the weight of the error vector, the relaxation of the classical interpolation problem (3) to the problem of finding the polynomial Q_(P)(X,Y) does not affect the solution. Hence, we hope that Q_(P)(X,Y) factors as Λ(X,Y)(Y−ƒ(X)) and, reversing arguments, we can read-off a list of tentative decoding decisions as factors of Q_(P)(X,Y) of type Y−ƒ(X). Thus the second (and last) step of the Sudan algorithm is as follows.

Factorization step: Given bivariate polynomial Q_(P)(X,Y), identify all the factors of Q_(P)(X,Y) of type Y − f(X) with deg f(X) < k. The output of the algorithm is a list of the codewords that correspond to these factors.

The factorization of Q_(P)(X,Y) into factors of type Y−ƒ(X) with deg ƒ(X)<k is a surprisingly simple task. Efficient algorithms to accomplish such factorization can be found in D. Augot, L. Pecquet, A Hensel lifting to replace factorization in list-decoding of algebraic-geometric and Reed-Solomon codes, IEEE Trans. Inform. Theory, vol. 46, November 2000, in press; and M. A. Shokrollahi, Computing roots of polynomials over function fields of curves, preprint, 1998, for example. The fundamental question is under which conditions can one guarantee that the correct decoding decision is found on the list produced by the factorization step.
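For very small parameters, the factorization step can even be carried out by exhaustive search, which is enough to illustrate what is being computed. The sketch below (Python, brute force over all candidate polynomials; not the efficient root-finding methods cited above, and with our own function names) checks the divisibility criterion Q(X,ƒ(X))≡0. Applied to the output of the interpolation sketch above (p = 5, k = 2), it should recover both 1+X and 4+3X, the two factors that reappear in Example 1 of Section 2.

```python
# Brute-force stand-in for the factorization step (tiny q and k only): enumerate every
# f(X) of degree < k over GF(p) and keep those with Q(X, f(X)) identically zero,
# i.e. those for which Y - f(X) divides Q(X, Y).

from itertools import product, zip_longest

def poly_mul(a, b, p):
    """Product of two coefficient lists (lowest degree first), modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_add(a, b, p):
    return [(x + y) % p for x, y in zip_longest(a, b, fillvalue=0)]

def compose(Q, f, p):
    """Coefficients of g(X) = Q(X, f(X)), with Q a dict {(i, j): a_ij} and f a list."""
    g = [0]
    for (i, j), a in Q.items():
        term = [0] * i + [a % p]            # a_ij * X^i
        for _ in range(j):
            term = poly_mul(term, f, p)     # ... times f(X)^j
        g = poly_add(g, term, p)
    return g

def y_linear_factors(Q, k, p):
    """All f(X) of degree < k (as coefficient tuples) with Y - f(X) dividing Q(X, Y)."""
    return [f for f in product(range(p), repeat=k) if not any(compose(Q, list(f), p))]

# With Q taken from the interpolation sketch above, one expects [(1, 1), (4, 3)],
# i.e. f(X) = 1 + X and f(X) = 4 + 3X:
# print(y_linear_factors(Q, k=2, p=5))
```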

Theorem 2. Suppose that a vector y and a positive integer m are given. Then the factorization step produces a list that contains all codewords of ℂ_(q)(n,k) at distance less than $$t = n - \left\lbrack \frac{\delta}{m} \right\rbrack > \left\lbrack n\left(1 - \sqrt{R\,\frac{m+1}{m}}\right)\right\rbrack \qquad (6)$$

from y, where δ is the smallest integer such that N_(1,k−1)(δ)>nm(m+1)/2 and R=k/n.

For a proof of (6), see V. Guruswami, M. Sudan, op. cit. and R. R. Nielsen, T. Høholdt, op. cit. The inequality in (6) follows from (2). We note that Theorem 2 is a special case of Theorem 3, which we prove in the next section.
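To get a feel for the numbers in Theorem 2, the following short Python sketch computes the resulting decoding radius for a few multiplicities m; the (15,7) Reed-Solomon code used here is an illustrative choice, not a code discussed in this specification.

```python
# Decoding radius of Theorem 2: delta is the smallest integer with
# N_{1,k-1}(delta) > n*m*(m+1)/2, and the list contains all codewords within distance t.

def N(w, delta):
    """Closed form of Lemma 1 for N_{1,w}(delta)."""
    J = delta // w
    return (J + 1) * (delta + 1) - w * J * (J + 1) // 2

def radius(n, k, m):
    target = n * m * (m + 1) // 2
    delta = 0
    while N(k - 1, delta) <= target:
        delta += 1
    return n - delta // m          # t = n - floor(delta / m)

for m in (1, 2, 4, 8):
    print(m, radius(15, 7, m))     # should print t = 5, 5, 6, 6 for this illustrative code
```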

2. Algebraic Soft-decision Decoding of Reed-Solomon Codes

In many situations (see E. R. Berlekamp, R. E. Peile, S. P. Pope, “The application of error control to communications,” IEEE Commun. Mag., vol. 25, pp. 44-57, 1987; A. Vardy, Y. Be'ery, “Bit-level soft-decision decoding of Reed-Solomon codes”, IEEE Trans. Commun., vol. 39, pp. 440-445, March 1991), the decoder can be supplied with probabilistic reliability information concerning the received symbols. A decoding algorithm that utilizes such information is generally referred to as a soft-decision decoding algorithm. We now specify this notion precisely, in the context of the present invention. First, we define a memoryless channel, or simply a channel, as a collection of a finite input alphabet 𝒳, an output alphabet 𝒴, and functions

ƒ(·|x): 𝒴→[0,1] for all x∈𝒳

that are assumed to be known to the decoder. It is possible to think of channel input and output as random variables X and Y, respectively, and assume that X is uniformly distributed over 𝒳. If the channel is continuous (e.g. Gaussian), then Y is continuous and the ƒ(·|x) are probability-density functions, while if the channel is discrete then Y is discrete and the ƒ(·|x) are probability-mass functions. In either case, the decoder can compute the probability that α∈𝒳 was transmitted given that y∈𝒴 was observed, as follows $$\Pr(X = \alpha \mid Y = y) = \frac{f(y \mid \alpha)\Pr(X = \alpha)}{\sum_{x \in \mathcal{X}} f(y \mid x)\Pr(X = x)} = \frac{f(y \mid \alpha)}{\sum_{x \in \mathcal{X}} f(y \mid x)} \qquad (7)$$

where the second equality follows from the assumption that X is uniform. For Reed-Solomon codes, the input alphabet is always fixed to 𝒳=𝔽_(q). Henceforth, let α₁,α₂, . . . , α_(q) be a fixed ordering of the elements of 𝔽_(q); this ordering will be implicitly assumed in the remainder of this specification. Given the vector y=(y₁,y₂, . . . , y_(n))∈𝒴^(n) observed at the channel output, we compute $$\pi_{i,j} \overset{def}{=} \Pr(X = \alpha_i \mid Y = y_j) \quad \text{for } i = 1,2,\ldots,q \text{ and } j = 1,2,\ldots,n \qquad (8)$$

according to the expression in (7). Let Π be the q×n matrix with entries π_(i,j) defined in (8). We will refer to Π as the reliability matrix and assume that Π is the input to the algebraic soft-decision decoding algorithm. For notational convenience, we will sometimes write Π(α,j) to refer to the entry found in the j-th column of Π in the row indexed by α∈𝔽_(q). We note that in some applications (see E. R. Berlekamp, R. E. Peile, S. P. Pope, The application of error control to communications, IEEE Commun. Mag., vol. 25, pp. 44-57, 1987; W. W. Wu, D. Haccoun, R. E. Peile, Y. Hirata, “Coding for satellite communication”, IEEE J. Select. Areas Commun., vol. 5, pp. 724-785, 1987), it is the reliability matrix Π rather than the vector y∈𝒴^(n) that is directly available at the channel output. In many other cases, the channel output alphabet 𝒴 is quite different from 𝔽_(q). The first step in hard-decision decoding is thus the construction of the hard-decision vector u=(u₁,u₂, . . . , u_(n))∈𝔽_(q)^(n), where $$u_j \overset{def}{=} \arg\max_{\alpha \in \mathbb{F}_q} \Pi(\alpha,j) \quad \text{for } j = 1,2,\ldots,n \qquad (9)$$

This hard-decision vector is then taken as the channel output y=c+e, thereby converting the channel at hand into a hard-decision channel.
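As a small illustration of (7)-(9), the Python sketch below turns made-up channel likelihoods into a reliability matrix and the corresponding hard-decision vector; the toy channel model and all names are assumptions for illustration only.

```python
# Sketch of (7)-(9): from channel likelihoods f(y|x) to the reliability matrix Pi and the
# hard-decision vector u.  The channel model below is a made-up q-ary example.

field = [0, 1, 2, 3, 4]                 # fixed ordering alpha_1, ..., alpha_q of GF(5)

def likelihood(y, x):
    """Hypothetical f(y|x): the transmitted symbol is seen with probability 0.8,
    each of the other four symbols with probability 0.05."""
    return 0.8 if y == x else 0.05

def reliability_matrix(received, field, likelihood):
    """pi_{i,j} = Pr(X = alpha_i | Y = y_j) under a uniform prior, as in (7)-(8)."""
    columns = []
    for y in received:
        weights = [likelihood(y, a) for a in field]
        total = sum(weights)
        columns.append([w / total for w in weights])
    # transpose so rows are indexed by field elements and columns by code positions
    return [[columns[j][i] for j in range(len(received))] for i in range(len(field))]

def hard_decision(Pi, field):
    """u_j = argmax_alpha Pi(alpha, j), as in (9)."""
    return [field[max(range(len(field)), key=lambda i: Pi[i][j])]
            for j in range(len(Pi[0]))]

Pi = reliability_matrix([1, 2, 3, 3, 0], field, likelihood)
print(hard_decision(Pi, field))         # -> [1, 2, 3, 3, 0]
```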

On the other hand, a soft-decision decoder works directly with the probabilities compiled in the reliability matrix Π. If the decoder is algebraic, it must convert these probabilities into algebraic conditions. Before presenting a formal description of the proposed soft-decision decoding procedures, we give an example that illustrates the main idea. This example was also discussed in detail in the Summary part of this specification.

Example 1. Let q=5, so that 𝔽_(q) is the set of integers {0, 1, 2, 3, 4} with operations modulo 5. We take D=𝔽_(q)=ℤ₅ and consider the Reed-Solomon code ℂ₅(5,2) defined as $$\mathbb{C}_5(5,2) \overset{def}{=} \left\{ \left(f(0),f(1),f(2),f(3),f(4)\right) : f(X) = a + bX \text{ with } a,b \in \mathbb{Z}_5 \right\} \qquad (10)$$

Suppose that the codeword c=(1, 2, 3, 4, 0) corresponding to ƒ(X)=1+X was transmitted, resulting in the following reliability matrix $\begin{matrix} {\Pi = \begin{bmatrix} 0.01 & 0.0025 & 0.05 & 0.14 & 0.20 \\ 0.06 & 0.0025 & 0.09 & 0.14 & 0.05 \\ 0.02 & 0.9900 & 0.15 & 0.07 & 0.20 \\ 0.01 & 0.0012 & 0.61 & 0.44 & 0.40 \\ 0.90 & 0.0038 & 0.10 & 0.21 & 0.15 \end{bmatrix}} & (11) \end{matrix}$

We assume, as before, that the rows and columns of Π are indexed by the elements 0,1,2,3,4 of ₅, in this order. The hard-decision vector derived from Π according to (9) is u=(4,2,3,3,3), which corresponds to errors in positions 0, 3 and 4.

It follows that even a maximum-likelihood hard-decision decoder will fail to reconstruct the transmitted codeword c, since there exists another codeword (3, 3, 3, 3, 3)∈₅ (5,2) that is closer to u in the Hamming metric. The list-decoding algorithm of Guruswami-Sudan (V. Guruswami, M. Sudan, op. cit.) will fail as well, since the number of erroneous positions exceeds the error-correction capability of the algorithm (cf. Theorem 2). The GMD soft-decision decoding algorithm (G. D. Forney, Jr., Generalized minimum distance decoding, IEEE Trans. Inform. Theory, vol. 12, pp. 125-131, April 1966) will also fail to reconstruct c=(1, 2, 3, 4, 0). Since the last three positions in u=(4, 2, 3, 3, 3) are the least reliable, the GMD decoder will perform two decoding trials, attempting to correct u′=(4, 2, 3, 3, φ) and u″ =(4, 2, φ, φ, φ), where φ denotes erasure. However, the decoder will produce (4, 2, 0, 3, 1)∈₅ (5,2) in both trials.

Nevertheless, we now show that the transmitted codeword can, in fact, be reconstructed without resorting to full maximum-likelihood soft-decision decoding. Given the information in the reliability matrix Π, it is clear that it would be beneficial to weigh each position differently. This can be accomplished by requiring, during the interpolation step, that the polynomial Q(X,Y) passes through different points with different multiplicities. More precisely, we would like to select the interpolation points and their multiplicities so as to reflect the information in Π in as much as possible. A simple greedy procedure for this purpose is derived in the next section. This procedure was also illustrated in the Summary part of this specification. For the special case of our example, the procedure produces the following list

point (x,y):     (1,2)   (0,4)   (2,3)   (3,3)   (4,3)
multiplicity:      3       2       2       1       1          (12)

These points and multiplicities are shown to be optimal for the reliability matrix in (11), in a precise sense described in the next section. The minimal (1,1)-weighted degree polynomial that passes through all the points in (12) with the required multiplicities turns out to be $\begin{matrix} {{Q\left( {X,Y} \right)} = \quad {1 + X + Y - X^{2} - Y^{2} - {2X^{2}Y} + {Y^{2}X} - Y^{3} + X^{4} -}} \\ {\quad {{2{YX}^{3}} - {X^{2}Y^{2}} + {2Y^{3}X}}} \\ {= \quad {\left( {Y - X - 1} \right)\left( {Y - {3X} - 4} \right)\left( {1 + Y + {3X} + {3X^{2}} + {3{XY}}} \right)}} \end{matrix}$

We identify the two solutions ƒ₁(X)=1+X and ƒ₂(X)=4+3X as corresponding to the codewords (1, 2, 3, 4, 0) and (4, 2, 0, 3, 1), respectively. Referring to the reliability matrix in (11) once again, we see that

Π(1,0)Π(2,1)Π(3,2)Π(4,3)Π(0,4)>Π(4,0)Π(2,1)Π(0,2)Π(3,3)Π(1,4)

Thus the transmitted codeword c=(1, 2, 3, 4, 0) is more likely than (4, 2, 0, 3, 1) given the observations; it will therefore be selected as the decoder output.
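The arithmetic behind Example 1 is easy to reproduce; the short Python check below recomputes the hard-decision vector from the matrix in (11) and confirms the likelihood comparison between the two candidate codewords (the variable and function names are ours).

```python
# Numerical check of Example 1: hard-decision vector from (11) and the comparison of the
# per-position posterior products for the two codewords found by the factorization step.

Pi = [
    [0.01, 0.0025, 0.05, 0.14, 0.20],
    [0.06, 0.0025, 0.09, 0.14, 0.05],
    [0.02, 0.9900, 0.15, 0.07, 0.20],
    [0.01, 0.0012, 0.61, 0.44, 0.40],
    [0.90, 0.0038, 0.10, 0.21, 0.15],
]

# Hard-decision vector u_j = argmax_alpha Pi(alpha, j), cf. (9): gives (4, 2, 3, 3, 3).
u = [max(range(5), key=lambda i: Pi[i][j]) for j in range(5)]
print(u)

def posterior_product(codeword):
    """Product over the positions j of Pi(c_j, j) for a candidate codeword c."""
    result = 1.0
    for j, c in enumerate(codeword):
        result *= Pi[c][j]
    return result

# The transmitted codeword (1,2,3,4,0) is more likely than (4,2,0,3,1), as stated above.
print(posterior_product([1, 2, 3, 4, 0]) > posterior_product([4, 2, 0, 3, 1]))   # -> True
```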

Example 1 shows how a soft-decision decoding algorithm for Reed-Solomon codes might work. The “soft” reliability information will enter the decoding process through the choice of interpolation points and their multiplicities. Given a reliability matrix Π, the key question is how to determine these points. An answer to this question is provided in the next section. In the remainder of this section, we characterize the proposed soft-decoding algorithm for a given choice of interpolation points and their multiplicities.

A convenient way to keep track of the interpolation points and their multiplicities is by means of a multiplicity matrix. A multiplicity matrix is a q×n matrix M with non-negative integer entries m_(i,j). The first step of our decoding algorithm consists of computing the multiplicity matrix M from the reliability matrix Π (see Algorithm A of the next section). The second step is the “soft” interpolation step, which may be expressed as follows.

Soft interpolation step: Given the point set D = {x₁,x₂,...,x_(n)} and the multiplicity matrix M = [m_(ij)], compute the nontrivial bivariate polynomial Q_(M)(X,Y) of minimal (1,k-1)-weighted degree that has a zero of multiplicity at least m_(ij) at the point (x_(j),α_(i)) for every i,j such that m_(ij) ≠ 0.

Note that while the bivariate polynomial Q_(M)(X,Y) of minimal (1,k−1)-weighted degree is optimal and best selected for use in our method, it is also possible to use a polynomial of other than the least possible (1,k−1)-weighted degree, which polynomial would be sub-optimal. The method will still work, but less efficiently so.

The third step of our algorithm is the factorization step, which is identical to the factorization step of the Sudan algorithm, described in the previous section.

Notice that the soft interpolation step of our algorithm is different from the weighted polynomial reconstruction problem considered in V. Guruswami, M. Sudan, op. cit. Guruswami and Sudan take a fixed set P={(x₁,y₁),(x₂,y₂), . . . , (x_(n),y_(n))} of n points in 𝔽_(q)×𝔽_(q) and seek a polynomial Q_(P)(X,Y) that passes through these n points with prescribed multiplicities m₁,m₂, . . . , m_(n). In contrast, in our algorithm, there is no constraint on the number of nonzero entries in the multiplicity matrix M and/or on their distribution. In fact, in most cases of interest, the polynomial Q_(M)(X,Y) computed in the soft interpolation step will pass through many more than n points with nonzero multiplicity.

It is easy to see from (4) and (5) that the computation of the polynomial Q_(M)(X,Y) is equivalent to solving a system of linear equations of the type (4). The complexity of this computation depends primarily on the number of equations in the system.

Definition 3. Given a q×n matrix M with non-negative integer entries m_(i,j), we define the cost of M as follows $$\mathcal{C}(M) \overset{def}{=} \frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{n} m_{i,j}\left(m_{i,j}+1\right)$$

As observed in (4), a given zero of multiplicity m imposes m(m+1)/2 linear constraints on the coefficients of Q_(M)(X,Y). Thus 𝒞(M) is precisely the total number of linear equations. It follows that the complexity of computing Q_(M)(X,Y) is governed by the cost of the multiplicity matrix M. As in (5), we can always find a solution Q_(M)(X,Y) to the soft interpolation task if the (1,k−1)-weighted degree δ is large enough, namely if

N_(1,k−1)(δ) > 𝒞(M)  (13)

so that the number of degrees of freedom is greater than the number of linear constraints. The solution Q_(M)(X,Y) is said to contain a codeword c∈_(q)(n,k) if the factorization of Q_(M)(X,Y) contains a factor Y−ƒ(X) such that c=(ƒ(x₁),ƒ(x₂), . . . , ƒ(x_(n))). The fundamental question is under which conditions does the polynomial Q_(M)(X,Y) found in the soft interpolation step contain a given codeword c∈_(q)(n,k).

The answer to this question is given in Theorem 3 below. In order to state this theorem, we need some more notation. First, we define the function $$\Delta_{w_X,w_Y}(v) \overset{def}{=} \min\left\{ \delta \in \mathbb{Z} : N_{w_X,w_Y}(\delta) \geq v \right\} \qquad (14)$$

Notice that Δ_(1,k−1)(v)≦√(2kv) in view of (2). Next, given two q×n matrices A and B over the same field, we define the inner product $$\langle A,B \rangle \overset{def}{=} \sum_{i=1}^{q}\sum_{j=1}^{n} a_{i,j}b_{i,j}$$

Finally, it will be convenient to think of the codewords of the Reed-Solomon code _(q) (n,k) as q×n matrices over the reals. Specifically, any vector v=(v₁,v₂, . . . , v_(n)) over _(q) can be represented by the q×n real-valued matrix [v] defined as follows: [v]_(i,j)=1 if v_(j)=α_(i), and [v]_(i,j)=0 otherwise. Notice that, by construction, the matrix [v] contains precisely one nonzero entry in each column. With this notation, we have the following definition.

Definition 4. The score of a vector v=(v₁,v₂, . . . , v_(n)) over 𝔽_(q) with respect to a given multiplicity matrix M is defined as the inner product S_(M)(v)=⟨M,[v]⟩.
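A couple of one-line helpers make Definitions 3 and 4 concrete; the Python sketch below (purely illustrative, with our own names and data layout) computes the cost and the score for the multiplicity matrix corresponding to the points and multiplicities in (12).

```python
# Cost of a multiplicity matrix (Definition 3) and score of a vector (Definition 4),
# with matrix rows indexed by the field elements 0..q-1 and columns by the positions.

def cost(M):
    """C(M) = (1/2) * sum of m_ij * (m_ij + 1): the number of constraints of type (4)."""
    return sum(m * (m + 1) // 2 for row in M for m in row)

def score(M, v):
    """S_M(v) = <M, [v]>: sum over positions j of the entry of M in row v_j, column j."""
    return sum(M[vj][j] for j, vj in enumerate(v))

# The multiplicity matrix corresponding to the points and multiplicities in (12):
M = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0],
    [0, 0, 2, 1, 1],
    [2, 0, 0, 0, 0],
]
print(cost(M), score(M, [1, 2, 3, 4, 0]))   # -> 14 5 (cf. Example 2 below)
```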

Consider now Theorem 3:

Theorem 3. For a given multiplicity matrix M, the polynomial Q_(M)(X,Y) contains a codeword c∈_(q)(n,k) if the score of c is large enough compared to the cost of M, namely if

S_(M)(c) > Δ_(1,k−1)(𝒞(M))  (15)

Consider now the proof of Theorem 3:

Let c=(c₁,c₂, . . . , c_(n)) be a codeword of ℂ_(q)(n,k), let D={x₁,x₂, . . . , x_(n)} be the set of points used in the construction of ℂ_(q)(n,k), and let ƒ(X) be the polynomial generating c, so that c_(j)=ƒ(x_(j)) for j=1,2, . . . , n. Given the bivariate polynomial Q_(M)(X,Y), we define the polynomial g(X)∈𝔽_(q)[X] as follows $$g(X) \overset{def}{=} Q_M\left(X, f(X)\right)$$

It would clearly suffice to prove that (15) implies that g(X) is the all-zero polynomial, since then Q_(M)(X,Y) must be divisible by Y−ƒ(X). To prove that g(X)≡0, we will show that deg g(X)≦Δ_(1,k−1)(𝒞(M)) and yet g(X) has a factor of degree S_(M)(c). We write

S_(M)(c)=⟨M,[c]⟩=m₁+m₂+ . . . +m_(n)  (16)

where m_(j) denotes the entry found in the j-th column of M in the row indexed by c_(j). Thus the polynomial Q_(M)(X,Y) passes through the point (x_(j),c_(j)) with multiplicity at least m_(j), for j=1,2, . . . , n.

Consider now Lemma 4:

Lemma 4. Suppose that a bivariate polynomial Q(X,Y) passes through a point (α,β) in 𝔽_(q)×𝔽_(q) with multiplicity at least m, and let p(X) be any polynomial in 𝔽_(q)[X] such that p(α)=β. Then the polynomial Q(X,p(X)) is divisible by (X−α)^(m).

This lemma is identical to Lemma 4 of V. Guruswami, M. Sudan, op. cit., and we omit the proof. Since ƒ(x_(j))=c_(j) for all j=1,2, . . . , n, it follows from Lemma 4 that the polynomial g(X)=Q(X,ƒ(X)) is divisible by the product

(X−x₁)^(m₁)(X−x₂)^(m₂) . . . (X−x_(n))^(m_(n))

The degree of this product is S_(M)(c) by (16). Therefore, either deg g(X)≧S_(M)(c) or g(X)≡0. Since deg ƒ(X)≦k−1, the degree of g(X)=Q_(M)(X,ƒ(X)) cannot exceed the (1,k−1)-weighted degree of Q_(M)(X,Y). Yet it follows from (13) and (14) that the (1,k−1)-weighted degree of Q_(M)(X,Y) is precisely Δ_(1,k−1)(𝒞(M)). Thus if g(X)≢0 then S_(M)(c)≦deg g(X)≦Δ_(1,k−1)(𝒞(M)). Hence (15) implies that g(X)≡0. This completes the proof of Theorem 3.

3. Determination of Interpolation Points

This section deals with the conversion of posterior probabilities derived from the channel output into a choice of interpolation points and their multiplicities. More specifically, given a reliability matrix Π, as defined in (8), we would like to compute the multiplicity matrix M that serves as the input to the soft interpolation step of our algorithm.

It is clear from Theorem 3 that the multiplicity matrix M should be chosen so as to maximize the score S_(M)(c) of the transmitted codeword c∈ℂ_(q)(n,k) for a given cost 𝒞(M). Let ℳ_(q,n) denote the set of all q×n matrices with nonnegative integer entries m_(i,j), and define $$\mathcal{M}(\mathcal{C}) \overset{def}{=} \left\{ M \in \mathcal{M}_{q,n} : \frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{n} m_{i,j}\left(m_{i,j}+1\right) = \mathcal{C} \right\}$$

Thus ℳ(𝒞) is the finite set of all matrices in ℳ_(q,n) whose cost is equal to 𝒞, and our problem is that of finding the optimal matrix in ℳ(𝒞). We shall see that while this problem is apparently intractable for an arbitrary 𝒞, for certain specific values of 𝒞 the problem has a simple polynomial-time solution. This will suffice for the purposes of this invention.

Next, we elaborate upon what optimality means in our context. As explained above, we would like to choose M∈ℳ(𝒞) so as to maximize the score of the transmitted codeword c∈ℂ_(q)(n,k). However, the transmitted codeword itself is obviously unknown to the decoder; only some stochastic information about c is available through the observation of the channel output (y₁,y₂, . . . , y_(n))∈𝒴^(n) and the knowledge of the channel transition probabilities Pr(X=α|Y=y). In fact, as far as the decoder is concerned, the transmitted codeword may be thought of as a random vector, which we denote by X=(X₁,X₂, . . . , X_(n)). For a given multiplicity matrix M, the score of the transmitted codeword is a function of X given by S_(M)(X)=⟨M,[X]⟩. Thus S_(M)(X) is a random variable, and a reasonable optimization criterion is to choose M∈ℳ(𝒞) so as to maximize the expected value of S_(M)(X). Specifically, we define the expected score with respect to a probability distribution P(·) on X as follows $$E_P\{S_M(X)\} \overset{def}{=} \sum_{x \in \mathbb{F}_q^{\,n}} S_M(x)P(x) = \sum_{x \in \mathbb{F}_q^{\,n}} \sum_{j=1}^{n} M(x_j, j)P(x)$$

where M(x_(j),j) denotes the entry found in the j-th column of M in the row indexed by x_(j). It remains to specify P(·). For this purpose, we adopt the product distribution determined by the channel output (y₁,y₂, . . . , y_(n))∈𝒴^(n), namely $$P(x_1,x_2,\ldots,x_n) \overset{def}{=} \prod_{j=1}^{n} \Pr(X_j = x_j \mid Y_j = y_j) = \prod_{j=1}^{n} \Pi(x_j, j) \qquad (17)$$

where Π is the reliability matrix defined in (8). It is easy to see that this would be the a posteriori distribution of X given the channel observations, if the a priori distribution of X were uniform over the space 𝔽_(q)^(n). The remainder of this section is concerned with the computation of the matrix M(Π,𝒞) defined as follows $$M(\Pi,\mathcal{C}) \overset{def}{=} \arg\max_{M \in \mathcal{M}(\mathcal{C})} E_P\{S_M(X)\} \qquad (18)$$

where the expectation is taken with respect to the probability distribution P(·) in (17). We start with the following lemma, which gives a useful expression for the expected score.

Lemma 5. The expected score with respect to the probability distribution in (17) is equal to the inner product of the multiplicity matrix and the reliability matrix, namely

E_(P){S_(M)(X)}=⟨M,Π⟩  (19)

Consider now the proof of Lemma 5:

The lemma follows from the fact that if X is distributed according to (17), then the reliability matrix Π is precisely the expected value of [X]. To see this, consider the random vector X′=(X₁,X₂, . . . , X_(n−1)) obtained by deleting the last component of X=(X₁,X₂, . . . , X_(n)). The probability distribution of X′ can be obtained by marginalizing (17) with respect to X_(n). Explicitly, if x′=(x₁,x₂, . . . , x_(n−1))∈𝔽_(q)^(n−1) then $$\Pr(X' = x') = \sum_{x_n \in \mathbb{F}_q} P(x_1,x_2,\ldots,x_{n-1},x_n) = \sum_{x_n \in \mathbb{F}_q} \prod_{j=1}^{n} \Pi(x_j,j) = \prod_{j=1}^{n-1} \Pi(x_j,j)$$

where the last equality follows from the fact that $\sum_{x_n \in \mathbb{F}_q} \Pi(x_n,n) = 1.$

This, in particular, implies that $$1 = \sum_{x' \in \mathbb{F}_q^{\,n-1}} \Pr(X' = x') = \sum_{x' \in \mathbb{F}_q^{\,n-1}} \prod_{j=1}^{n-1} \Pi(x_j,j) = \sum_{\substack{x \in \mathbb{F}_q^{\,n} \\ x_j = \alpha}} \prod_{\substack{l=1 \\ l \neq j}}^{n} \Pi(x_l,l) \qquad (20)$$

for any j∈{1,2, . . . , n} and any α∈𝔽_(q). The last equality in (20) follows by applying a similar argument to the random vector obtained by deleting the j-th component of X. Now consider the q×n matrix 𝒫=[p_(i,j)], which may be thought of as the expected value of [X] with respect to the distribution P(·) in (17). Specifically, we define 𝒫 as follows $$\mathcal{P} \overset{def}{=} \sum_{x \in \mathbb{F}_q^{\,n}} [x]\,P(x) \qquad (21)$$

Since [x]_(i,j)=1 if x_(j)=α_(i), and [x]_(i,j)=0 otherwise, the entry found in row i and column j of the matrix 𝒫 is given by $$p_{i,j} = \sum_{\substack{x \in \mathbb{F}_q^{\,n} \\ x_j = \alpha_i}} P(x) = \sum_{\substack{x \in \mathbb{F}_q^{\,n} \\ x_j = \alpha_i}} \prod_{l=1}^{n} \Pi(x_l,l) = \Pi(\alpha_i,j) \sum_{\substack{x \in \mathbb{F}_q^{\,n} \\ x_j = \alpha_i}} \prod_{\substack{l=1 \\ l \neq j}}^{n} \Pi(x_l,l) \qquad (22)$$

The summation on the right-hand side of (22) evaluates to 1 by (20), which implies that p_(i,j)=Π(α_(i),j)=π_(i,j) for all i∈{1,2, . . . , q} and all j∈{1,2, . . . , n}. Therefore 𝒫=Π. The lemma can now be easily proved by interchanging expectation with the inner product: E_(P){S_(M)(X)}=E_(P){⟨M,[X]⟩}=⟨M,E_(P){[X]}⟩=⟨M,Π⟩. More explicitly, we have $$E_P\{S_M(X)\} = \sum_{x \in \mathbb{F}_q^{\,n}} \langle M,[x] \rangle P(x) = \sum_{x \in \mathbb{F}_q^{\,n}} \langle M,[x]P(x) \rangle = \left\langle M, \sum_{x \in \mathbb{F}_q^{\,n}} [x]P(x) \right\rangle = \langle M,\Pi \rangle$$

where the first two equalities follow from the linearity of the inner product, while the last equality follows from the definition of 𝒫=Π in (21). This completes the proof of Lemma 5.

We will construct M(Π,𝒞) iteratively, starting with the all-zero matrix and increasing one of the entries in the matrix at each iteration. Referring to Lemma 5, we see that increasing m_(i,j) from 0 to 1 increases the expected score by π_(i,j) while increasing the cost by 1. If we require that Q_(M)(X,Y) passes through the same point again—that is, increase m_(i,j) from 1 to 2—then the expected score again grows by π_(i,j), but now we have to “pay” two additional linear constraints. In general, increasing m_(i,j) from a to a+1 always increases the expected score by π_(i,j) while introducing a+1 additional constraints of type (4).

Example 2. Returning to Example 1 of the previous section, consider the code ℂ₅(5,2) given by (10) and the reliability matrix Π in (11). Suppose we restrict the cost of the multiplicity matrix to 14, that is, we wish to find

M(Π,14) = arg max_(M∈ℳ(14)) ⟨M,Π⟩

We construct a multiplicity matrix M by a greedy iterative process, starting with the 5×5 all-zero matrix, and requiring at each iteration that the newly chosen interpolation point maximizes the increase in the expected score normalized by the number of additional linear constraints (the increase in cost). This process is explained in detail in the Summary part of this specification.

TABLE 1
Iterative construction of a multiplicity matrix

iteration   interpolation point (i,j)   increase in score   increase in cost   d score / d cost   total cost   expected score
    1                (1,2)                    0.99                  1                0.990              1            0.99
    2                (0,4)                    0.90                  1                0.900              2            1.89
    3                (2,3)                    0.61                  1                0.610              3            2.50
    4                (1,2)                    0.99                  2                0.495              5            3.49
    5                (0,4)                    0.90                  2                0.450              7            4.39
    6                (3,3)                    0.44                  1                0.440              8            4.83
    7                (4,3)                    0.40                  1                0.400              9            5.23
    8                (1,2)                    0.99                  3                0.330             12            6.22
    9                (2,3)                    0.61                  2                0.305             14            6.83

Table 1 shows the sequence of chosen interpolation points. Observe that the column that contains the ratio of the increase in the expected score to the increase in cost is strictly decreasing. The resulting multiplicity matrix M is described in equation (12) of Example 1 and in the Summary part of this specification. It can be verified by exhaustive search that max_(M∈ℳ(14)) ⟨M,Π⟩=6.83, so M=M(Π,14). Notice that N_(1,1)(3)=10 while N_(1,1)(4)=15 by Lemma 1, so that Δ_(1,1)(14)=4. Thus the expected score exceeds the minimum score required for successful decoding (cf. Theorem 3) by a factor of about 1.7. This gives a high level of confidence that the actual score of the transmitted codeword will also exceed Δ_(1,k−1)(14)=4. Indeed, the score of c=(1,2,3,4,0)∈ℂ₅(5,2) with respect to M is S_(M)(c)=5.

The greedy iterative procedure used in Example 2 turned out to be optimal for that case. We formalize this procedure as Algorithm A:

Algorithm A

Input: Reliability matrix Π and a positive integer s, indicating the total number of interpolation points.

Output: Multiplicity matrix M.

Initialization step: Set Π*←Π and M←all-zero matrix.

Iteration step: Find the position (i,j) of the largest element π*_(i,j) in Π*, and set

π*_(i,j) ← π_(i,j)/(m_(i,j)+2),  m_(i,j) ← m_(i,j)+1,  s ← s−1

Control step: If s=0, return M; otherwise go to the iteration step.
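A direct Python transcription of Algorithm A is given below as an illustrative sketch (the function name and data layout are ours); applied to the reliability matrix in (11) with s = 9, it retraces the choices listed in Table 1 of Example 2 and returns the multiplicity matrix of (12).

```python
# Algorithm A: greedily place s interpolation points, always taking the entry of Pi* with
# the largest ratio of added expected score to added cost.

def algorithm_a(Pi, s):
    """Multiplicity matrix produced by Algorithm A from the q x n reliability matrix Pi
    (a list of q rows) and the total number s of interpolation points."""
    q, n = len(Pi), len(Pi[0])
    M = [[0] * n for _ in range(q)]
    Pi_star = [row[:] for row in Pi]
    for _ in range(s):
        # position (i, j) of the largest element of Pi*
        i, j = max(((i, j) for i in range(q) for j in range(n)),
                   key=lambda ij: Pi_star[ij[0]][ij[1]])
        Pi_star[i][j] = Pi[i][j] / (M[i][j] + 2)     # pi*_ij <- pi_ij / (m_ij + 2)
        M[i][j] += 1                                 # m_ij  <- m_ij + 1
    return M
```

A practical implementation might keep the entries of Π* in a priority queue instead of rescanning the matrix at every step; the rescan above simply keeps the sketch close to the statement of the algorithm.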

Let (Π,s) denote the multiplicity matrix produced by Algorithm A for a given reliability matrix Π and a given number of interpolation points s (counted with multiplicities). The following theorem shows that this matrix is optimal.

Theorem 6. The matrix (Π,s) maximizes the expected score among all matrices in ℳ_(q,n) with the same cost. That is, if 𝒞 is the cost of (Π,s), then

(Π,s) = arg max_(M∈ℳ(𝒞)) ⟨M,Π⟩

Consider now the proof from Theorem 6:

With each position (i,j) in the reliability matrix Π, we associate an infinite sequence of rectangles R_(i,j,1),R_(i,j,2), . . . indexed by the positive integers. Let ℛ denote the set of all such rectangles. For each rectangle R_(i,j,l)∈ℛ, we define length(R_(i,j,l))=l, height(R_(i,j,l))=π_(i,j)/l, and area(R_(i,j,l))=length(R_(i,j,l))·height(R_(i,j,l))=π_(i,j). For a multiplicity matrix M∈ℳ_(q,n), we define the corresponding set of rectangles S(M) as $$S(M) = \left\{ R_{i,j,l} : 1 \leq i \leq q,\; 1 \leq j \leq n,\; \text{and } 1 \leq l \leq m_{i,j} \right\} \qquad (23)$$

Observe that (23) establishes a one-to-one correspondence between the set of all q×n multiplicity matrices and the set of all finite subsets of ℛ. Also notice that the number of rectangles in S(M) is $\sum_{i=1}^{q}\sum_{j=1}^{n} m_{i,j}$,

which is precisely the total number of interpolation points imposed by the multiplicity matrix M (counted with multiplicities). Furthermore $$\mathcal{C}(M) = \sum_{i=1}^{q}\sum_{j=1}^{n} \frac{m_{i,j}(m_{i,j}+1)}{2} = \sum_{i=1}^{q}\sum_{j=1}^{n}\sum_{l=1}^{m_{i,j}} l = \sum_{i=1}^{q}\sum_{j=1}^{n}\sum_{l=1}^{m_{i,j}} \mathrm{length}(R_{i,j,l}) = \sum_{R \in S(M)} \mathrm{length}(R)$$ $$\langle M,\Pi \rangle = \sum_{i=1}^{q}\sum_{j=1}^{n} m_{i,j}\,\pi_{i,j} = \sum_{i=1}^{q}\sum_{j=1}^{n}\sum_{l=1}^{m_{i,j}} \pi_{i,j} = \sum_{i=1}^{q}\sum_{j=1}^{n}\sum_{l=1}^{m_{i,j}} \mathrm{area}(R_{i,j,l}) = \sum_{R \in S(M)} \mathrm{area}(R)$$

Thus the cost of M is the total length of all the rectangles in S(M) and the expected score ⟨M,Π⟩ is the total area of all the rectangles in S(M). It is intuitively clear that to maximize the total area for a given total length, one has to choose the highest rectangles. This is precisely what Algorithm A does: the algorithm constructs the matrix (Π,s) that corresponds to the set of s highest rectangles in ℛ. It is now obvious that if the s highest rectangles in ℛ have total length 𝒞 then no collection of rectangles of total length at most 𝒞 can have a larger total area. This completes the proof of Theorem 6.

Although Algorithm A produces an optimal multiplicity matrix (Π,s) for an arbitrary number of interpolation points s, it cannot be used to solve the optimization problem (18) for an arbitrary value of the cost 𝒞. The algorithm computes a solution to (18) only for those costs 𝒞 that are expressible as the total length of the s highest rectangles in ℛ for some s. In other words, if M(Π,1), M(Π,2), M(Π,3), . . . is the infinite sequence of matrices defined by (18) for 𝒞=1,2,3, . . . , then (Π,1), (Π,2), (Π,3), . . . is a subsequence of this sequence. On the other hand, this is the best one could hope for. It is easy to see from the proof of Theorem 6 that solving (18) for an arbitrary Π and 𝒞 is equivalent to solving the knapsack problem, which is a well-known NP-complete problem (see M. R. Garey, D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, 1979, p. 247).

Observe that any nonzero entry in Π will eventually (for s→∞) give rise to a nonzero entry in (Π,s). As the number of interpolation points increases, we can more and more closely match a given reliability matrix Π. In the limit, as the number s of interpolation points (and, hence, the complexity of the interpolation and factorization tasks) approaches infinity, the proposed soft-decision decoding algorithm reaches its limiting performance. The asymptotic performance analysis for s→∞ (or, equivalently, for 𝒞→∞) is carried out in Appendix A to this specification. We emphasize that one can approach this limiting performance arbitrarily closely by letting the number s of interpolation points be large but bounded. In this case, the complexity of our soft-decoding algorithm stays bounded by a polynomial in the length of the code.

4. Algebraic Soft-decision Decoding of BCH Codes

Bose-Chaudhuri-Hocquenghem (BCH) codes are easily described as subfield subcodes of Reed-Solomon codes. Let ℂ_(q)(n,k) be a Reed-Solomon code over 𝔽_(q), as defined in (1), where q=p^(m) for a prime p and an integer m≧1. Let q′=p^(m′), where m′ is a divisor of m. Then 𝔽_(q′) is a subfield of 𝔽_(q). A linear BCH code ℂ′_(q′)(n,k′) over 𝔽_(q′) is given by $$\mathbb{C}'_{q'}(n,k') \overset{def}{=} \mathbb{C}_q(n,k) \cap \mathbb{F}_{q'}^{\,n} \qquad (24)$$

It is easy to see from (24) that the dimension of ′_(q′)(n,k′) is k′≦k, and its minimum Hamming distance is d′≧n−k+1. We call d=n−k+1 the designed distance (c.f. F. J. MacWilliams, N. J. A. Sloane, The Theory of Error Correcting Codes, New York: North-Holland, 1977, p. 202) of ′_(q′)(n,k′). Throughout this section, we assume that q=2^(m) and q′=2. Generalization to non-binary BCH codes (arbitrary prime powers q and q′) is straightforward.

For binary BCH codes, the channel input alphabet is 𝒳={0,1} and the channel is characterized by two transition-probability functions ƒ(y|0) and ƒ(y|1), as defined in (7). For y∈𝒴, let $$\mu(y) \overset{def}{=} \frac{f(y \mid 0)}{f(y \mid 0) + f(y \mid 1)} = \Pr(X = 0 \mid Y = y) \qquad (25)$$

and suppose that the vector y=(y₁,y₂, . . . , y_(n))∈^(n) is observed at the channel output. The corresponding reliability matrix is then the 2×n matrix given by $\begin{matrix} {{\Pi (y)}\overset{def}{=}\begin{bmatrix} {\mu \left( y_{1} \right)} & {\mu \left( y_{2} \right)} & \cdots & {\mu \left( y_{n} \right)} \\ {1 - {\mu \left( y_{1} \right)}} & {1 - {\mu \left( y_{2} \right)}} & \cdots & {1 - {\mu \left( y_{n} \right)}} \end{bmatrix}} & (26) \end{matrix}$

The key to algebraic soft-decision decoding of BCH codes is the following simple idea. The 2×n matrix in (26) can be formally extended to a q×n reliability matrix Π by appending, for each α∈𝔽_(q)\𝔽₂, an all-zero row to Π(y). The algebraic soft-decision decoder, presented with the q×n matrix Π, then proceeds exactly as before, following the three computation steps described in Sections 2 and 3. Note that the multiplicity matrix M computed in the first step (using Algorithm A) is essentially a 2×n matrix, since a zero entry in Π always leads to a zero entry in M. The soft interpolation step thus produces a polynomial Q_(M)(X,Y) with coefficients in 𝔽_(q) that has zeros (of prescribed multiplicities) at points of type (x_(j),α), where x_(j)∈𝔽_(q) and α∈𝔽₂. Finally, during the factorization step, we are only interested in those factors of type Y−ƒ(X) for which deg ƒ(X)<k and ƒ(X)∈𝔽_(q)[X] evaluates to an element of 𝔽₂ at every x_(j) in the defining point-set D.
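The embedding just described is mechanical; the following Python sketch (illustrative only, with our own function names) pads the 2×n matrix of (26) with zero rows and filters the factorization output down to binary codewords, assuming the fixed ordering of the field elements begins with 0 and 1.

```python
# Soft-decision decoding of a binary BCH code via the underlying RS code: extend the 2 x n
# reliability matrix of (26) with all-zero rows, and keep only binary-valued factors.

def embed_binary_reliability(mu, q):
    """mu = (mu(y_1), ..., mu(y_n)) as in (25); rows 0 and 1 of the result are the two rows
    of (26), the rows for the remaining q - 2 field elements are identically zero."""
    n = len(mu)
    Pi = [[0.0] * n for _ in range(q)]
    Pi[0] = list(mu)                      # Pr(X = 0 | Y = y_j)
    Pi[1] = [1.0 - m for m in mu]         # Pr(X = 1 | Y = y_j)
    return Pi

def keep_binary_factors(factors, evaluate):
    """Retain only those f(X) whose codeword (f(x_1), ..., f(x_n)) lies in {0, 1}^n;
    `evaluate` maps a factor to its codeword and is supplied by the caller (hypothetical)."""
    return [f for f in factors if all(c in (0, 1) for c in evaluate(f))]
```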

It follows directly from the discussion above that all the results derived in the preceding three sections for Reed-Solomon codes extend to the case of BCH codes. In particular, Theorem 3 holds without change. In the remainder of this section, we analyze the performance of binary BCH codes under algebraic soft-decision decoding for two channels of particular interest: the binary-input AWGN channel and the binary symmetric channel.

On additive white Gaussian noise channel with BPSK modulation 0→1 and 1→−1, the function μ(y) in (25) takes the form $\begin{matrix} {{\mu (y)} = {\frac{e^{y/\sigma^{2}}}{e^{y/\sigma^{2}} + e^{{- y}/\sigma^{2}}} = \frac{e^{2{y/\sigma^{2}}}}{1 + e^{2{y/\sigma^{2}}}}}} & (27) \end{matrix}$

where σ is the noise variance. The asymptotic condition for successful decoding can be therefore stated as follows.

Theorem 7. Suppose that a codeword of an (n,k′,d′) binary BCH code ′₂(n,k′) is transmitted over a BPSK-modulated AWGN channel, and the vector (y₁,y₂, . . . , y_(n))∈^(n) is observed at the channel output. Then the algebraic soft-decision decoder produces a list that contains a codeword c=(c₁,c₂, . . . , c_(n))∈′₂(n,k′) provided $\frac{\sum\limits_{j = 1}^{n}\quad \frac{e^{2y_{j}{c_{j}/\sigma}}}{1 + e^{2{y_{j}/\sigma}}}}{\sqrt{\sum\limits_{j = 1}^{n}\quad \frac{1 + e^{4{y_{j}/\sigma}}}{\left( {1 + e^{2{y_{j}/\sigma}}} \right)^{2}}}} \geq {\sqrt{n - d + 1} + {o(1)}}$

where σ is the noise variance, d is the designed distance of ′₂(n,k′), and o(1) denotes a function of the total number s of interpolation points that tends to zero as s→∞.

We next consider a binary symmetric channel (BSC) with crossover probability ε. For this channel, ={0,1} and the function μ(y) in (25) is given by ${\mu (y)} = {{f\left( y \middle| 0 \right)} = \left\{ \begin{matrix} {1 - ɛ} & {y = 0} \\ ɛ & {y = 1} \end{matrix} \right.}$

Suppose that a vector y=(y₁,y₂, . . . , y_(n))∈{0,1}^(n) is observed at the output of the BSC. Then the corresponding 2×n reliability matrix from (26) is given by $$\Pi(y) = \begin{bmatrix} 1 - y_1 + \varepsilon(2y_1 - 1) & 1 - y_2 + \varepsilon(2y_2 - 1) & \cdots & 1 - y_n + \varepsilon(2y_n - 1) \\ y_1 - \varepsilon(2y_1 - 1) & y_2 - \varepsilon(2y_2 - 1) & \cdots & y_n - \varepsilon(2y_n - 1) \end{bmatrix} \qquad (28)$$

For this matrix, we have <Π(y), Π(y)>=n−2nε(1−ε) and <Π(y), [c]>=εt+(1−ε)(n−t), where t is the Hamming distance between y and c. Thus for a binary symmetric channel the asymptotic condition for successful decoding is as follows.

Theorem 8. Suppose that a codeword of an (n,k′,d′) binary BCH code ℂ′₂(n,k′) is transmitted over a BSC with crossover probability ε, and the vector (y₁,y₂, . . . , y_(n))∈{0,1}^(n) is observed at the channel output. Then the algebraic soft-decision decoder produces a list that contains a codeword c=(c₁,c₂, . . . , c_(n))∈ℂ′₂(n,k′) provided $$\frac{t}{n} \leq \frac{(1-\varepsilon) - \sqrt{\left(1 - \frac{d-1}{n}\right) - 2\left(1 - \frac{d-1}{n}\right)\varepsilon(1-\varepsilon)}}{(1-\varepsilon) - \varepsilon} + o(1) \qquad (29)$$

where t is the Hamming distance between y and c, d is the designed distance of ′₂(n,k′), and o(1) is a function of the total number s of interpolation points.

Although the binary symmetric channel does not provide “soft” reliability information in the usual sense, it turns out that the soft-decoding techniques developed in this specification make it possible to decode well beyond the Guruswami-Sudan bound τ≦1−√R (see V. Guruswami, M. Sudan, op. cit.).

While this is already evident from Theorem 8, we can do even better, as described in what follows. First, let us rewrite the condition (29) in the following form $$\tau \leq \frac{1 - \varepsilon - \sqrt{R - 2R\,\varepsilon(1-\varepsilon)}}{1 - 2\varepsilon} \qquad (30)$$

where τ=t/n is the fraction of erroneous positions that can be corrected using our soft-decoding algorithm and R=k/n=1−(d−1)/n is the rate of the underlying Reed-Solomon code (notice that this rate is always higher than the rate R′=k′/n of the BCH code). Next, let us consider the following interpretation of Theorem 8: the algebraic soft-decision decoder, when presented with the reliability matrix Π(y) in (28), corrects a fraction of errors specified by (30). It is clear from this interpretation that one does not have to use the actual crossover probability of the BSC in setting up the reliability matrix in (28). Rather, we are interested in the value of ε∈[0,0.5] that maximizes the right-hand side of (30) for a fixed R. The optimal value of ε is given by $$\varepsilon^{*} \overset{def}{=} \frac{1}{2} - \frac{1}{2}\sqrt{2R - 1} = \frac{1}{2} - \frac{1}{2}\sqrt{1 - 2(d-1)/n} \qquad (31)$$

in which case the right-hand side of (30) is equal to ε*. In other words, if upon observing the vector y∈{0,1}^(n) at the output of the BSC, one constructs the matrix $$\Pi^{*}(y) \overset{def}{=} \begin{bmatrix} 1 - y_1 + \varepsilon^{*}(2y_1 - 1) & 1 - y_2 + \varepsilon^{*}(2y_2 - 1) & \cdots & 1 - y_n + \varepsilon^{*}(2y_n - 1) \\ y_1 - \varepsilon^{*}(2y_1 - 1) & y_2 - \varepsilon^{*}(2y_2 - 1) & \cdots & y_n - \varepsilon^{*}(2y_n - 1) \end{bmatrix}$$

where ε* is given by (31), then the algebraic soft-decision decoder, operating on Π*(y) as the reliability matrix, will produce a list of all codewords of ℂ′₂(n,k′) at Hamming distance at most τn from y, where $$\tau = \frac{1}{2} - \frac{1}{2}\sqrt{2R - 1} = \frac{1}{2} - \frac{1}{2}\sqrt{1 - 2(d-1)/n} \qquad (32)$$

We observe that the right-hand side of (32) can be characterized as the (smaller) root of the quadratic equation τ(1−τ)=(d−1)/2n. Thus we have the following theorem.

Theorem 9. Let ℂ′₂(n,k′) be a binary BCH code with designed distance d, let R=k/n be the rate of the underlying Reed-Solomon code, and let t_(BCH)=(d−1)/2. Then algebraic soft-decision list decoding of ℂ′₂(n,k′) on a binary symmetric channel corrects any fraction of τ≦0.5 errors, provided that τ satisfies the following inequality $$\tau(1-\tau) \leq \frac{t_{BCH}}{n} = \frac{1-R}{2} \qquad (33)$$

The bound of Theorem 9 makes it possible to directly compare our results with conventional hard-decision decoding (which corrects up to t_(BCH) errors) and with the list-decoding algorithm of Guruswami-Sudan (V. Guruswami, M. Sudan, op. cit.). Note that the Guruswami-Sudan bound on the fraction of correctable errors can be written as (1−τ)²≧R, whereas (33) can be recast as

(1−τ)²+τ² ≧R  (34)

The corresponding curves are plotted in FIG. 7 as a function of R. We observe that for R≦0.5, the conditions (33) and (34) become vacuous since min_(0≦τ≦1){(1−τ)²+τ²}=0.5.

This is consistent with the fact that for R=k/n≦0.5, the binary BCH code ′₂(n,k′) degenerates to the (n,1,n) repetition code, for which list decoding is trivial.
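The gap between the three criteria is easy to tabulate; the Python sketch below evaluates, for a few illustrative rates R > 0.5 of the underlying Reed-Solomon code, the hard-decision fraction (1−R)/2, the Guruswami-Sudan fraction 1−√R, and the soft-decision fraction given by (32).

```python
# Fractions of correctable errors on a BSC as a function of the RS rate R (for R > 0.5):
# hard-decision decoding, Guruswami-Sudan list decoding, and the bound of Theorem 9.

from math import sqrt

def tau_hard(R):
    return (1 - R) / 2                       # up to half the designed distance

def tau_gs(R):
    return 1 - sqrt(R)                       # Guruswami-Sudan bound

def tau_soft(R):
    return 0.5 - 0.5 * sqrt(2 * R - 1)       # smaller root of tau*(1 - tau) = (1 - R)/2, eq. (32)

for R in (0.6, 0.7, 0.8, 0.9):
    print(f"R = {R:.1f}:  hard {tau_hard(R):.3f}   GS {tau_gs(R):.3f}   soft {tau_soft(R):.3f}")
```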

For any specific length n, one can deduce the rate R′=k′/n of the binary BCH code from the rate R=k/n of the underlying Reed-Solomon code, which makes it possible to re-plot the curves in FIG. 7 as a function of the dimension k′ of ℂ′₂(n,k′). This is done in FIGS. 8 and 9 for binary BCH codes of length 63 and of length 127, respectively. Also plotted in FIGS. 8 and 9, for comparison, is the sphere-packing bound $$\log_2 \sum_{i=0}^{t} \binom{n}{i} \leq n - k'.$$

5. Soft-decision Decoding of Algebraic-geometric Codes

The decoding algorithms of Sudan (V. Guruswami, M. Sudan, op. cit.; and M. Sudan, op. cit.) can be adapted to the general class of algebraic-geometric codes. This was shown by Shokrollahi and Wasserman (M. A. Shokrollahi, H. Wasserman, List decoding of algebraic-geometric codes, IEEE Trans. Inform. Theory, vol. 45, pp. 432-437, 1999) and by Guruswami and Sudan (V. Guruswami, M. Sudan, op. cit.). The essential ideas of our soft-decision decoding algorithm carry over to algebraic-geometric codes in a similar way. In this section, we give a short description of algebraic-geometric codes and how soft-decision decoding can be applied to them.

5.1. Algebraic-geometric Codes

For simplicity we restrict our attention to codes from plane curves. Let 𝔽̄_(q) denote the algebraic closure of 𝔽_(q). Given an irreducible polynomial F(X,Z)∈𝔽_(q)[X,Z], a plane irreducible curve X: F(X,Z)=0 is defined as follows $$X \overset{def}{=} \left\{ (x,z) \in \overline{\mathbb{F}}_q^{\,2} : F(x,z) = 0 \right\}$$

We let X_(q)=X∩𝔽_(q)² denote the restriction of X to 𝔽_(q). We will be interested in the behavior of rational functions on the curve X. Let P=(x,z)∈X be an arbitrary point on X, and let ƒ(X,Z)=g(X,Z)/h(X,Z) be a rational function, where g, h∈𝔽_(q)[X,Z]. We write ƒ(P) to denote the evaluation of ƒ(X,Z) at P, namely ƒ(P)=ƒ(x,z). We will be interested only in the values of ƒ at the points on the curve X. Thus, it is useful to think of ƒ as a mapping ƒ: X→𝔽̄_(q)∪{∞}. Restricting to X_(q), it is obvious that ƒ: X_(q)→𝔽_(q)∪{∞}.

The points P∈X where ƒ(P)=0 are called the zeros of ƒ, and the points P∈X where ƒ(P)=∞ are called the poles of ƒ. See T. Høholdt, J. H. van Lint, R. Pellikaan, Algebraic geometry codes, Chapter 10, in V. S. Pless, W. C. Huffman (Editors), Handbook of Coding Theory, Amsterdam: Elsevier, 1998, p. 880 and H. Stichtenoth, Algebraic Function Fields and Codes, Berlin: Springer-Verlag, 1993, p. 7, for the definition of a multiplicity of a zero (pole) of ƒ. The following fundamental result is well known.

Lemma 10. For any rational function ƒ≠0, the number of zeros of ƒ in _(q) equals the number of poles of ƒ in _(q), counted with proper multiplicities including points at infinity.

Let P be any point on X. Given a rational function ƒ on X, a number v_(P)(ƒ), called the valuation of ƒ at P, can be defined as follows: $$v_P(f) \overset{def}{=} \begin{cases} \infty & f = 0 \\ m & f \text{ has a zero of multiplicity } m \text{ but not } m+1 \text{ at } P \\ -m & f \text{ has a pole of multiplicity } m \text{ but not } m+1 \text{ at } P \end{cases}$$

Next, we fix a point ∈X_(q) and define L(m) as the set of rational functions on X that may contain a pole only at and the multiplicity of this pole is at most m. It is known that L(m) is an _(q)-linear space, of dimension dim L(m). Let $\begin{matrix} {g\overset{def}{=}{{\max\limits_{m \geq 0}\left\{ {m - {\dim \quad {L\left( {m} \right)}}} \right\}} + 1}} & (35) \end{matrix}$

The number g is called the genus of X. It is a consequence of the Riemann-Roch theorem that g is well-defined (see H. Stichtenoth, Algebraic Function Fields and Codes, Berlin: Springer-Verlag, 1993). That is, g is finite and independent of the choice of ∈X_(q).

Let D={P₁, P₂, . . . P_(n)} be a set of n distinct points in X_(q), such that ∉D. We define an algebraic-geometric (AG) code _(X)(m,), in a manner analogous to (1), as follows $\begin{matrix} {{{\mathbb{C}}_{X}\left( {m,} \right)}\overset{def}{=}\left\{ {\left( {{f\left( P_{1} \right)},{f\left( P_{2} \right)},\cdots \quad,{f\left( P_{n} \right)}} \right):{f \in {L\left( {m,} \right)}}} \right\}} & (36) \end{matrix}$

It follows trivially from (36) that _(X)(m,) has length n. For m<n, the dimension of _(X)(m,) is k=dim L(m), and it follows directly from the definition of genus of X in (35) that k≧m−g+1. Moreover, it is well known that k=m−g+1 if 2g−1≦m<n. In most cases of interest, the genus g is small, and therefore the latter condition is not a significant restriction on the rate of the resulting code. Thus, we shall henceforth assume that 2g−1≦m<n always holds. The minimum distance of _(X) (m,) satisfies d≧n−m. This follows immediately from the fact that since ƒ∈L(m,) has at most m poles, it cannot have more than m zeros in D by Lemma 10.

5.2. Algebraic Soft-decision Decoding

Following the work of Shokrollahi-Wasserman (M. A. Shokrollahi, H. Wasserman, op. cit.) and Guruswami-Sudan (V. Guruswami, M. Sudan, op. cit.), we now show how every step in the decoding of Reed-Solomon codes has a clear analogue in the case of algebraic-geometric codes. Note that the first step of our algorithm remains unchanged: given a q×n reliability matrix Π, we compute the corresponding multiplicity matrix M=(Π,s) using Algorithm A, as described in Section 3.

In order to describe the soft interpolation step of our algorithm, we need more notation. In particular, we need to extend the notions of interpolation polynomial, weighted degree, and zero of multiplicity m of the interpolation polynomial to the case of algebraic-geometric codes. First, we define a basis for the space L(mQ). Since dim L(mQ)=m−g+1, there must exist g numbers o₁,o₂, . . . , o_(g) such that dim L(o_(i)Q)=dim L((o_(i)−1)Q). For a given point Q, these numbers o₁,o₂, . . . , o_(g) are called the gaps at Q. For i=0, 1, . . . , m, we fix an arbitrary rational function φ_(i) in L(iQ)\L((i−1)Q) if such a function exists, which happens if and only if i∉{o₁,o₂, . . . , o_(g)}. For i∈{o₁,o₂, . . . , o_(g)}, we set φ_(i)=0 by convention. It is obvious by construction that the (nonzero) functions φ₀,φ₁, . . . , φ_(m) are linearly independent over 𝔽_(q) and span the space L(mQ). We will expand all rational functions in L(mQ) in the basis φ₀,φ₁, . . . , φ_(m), namely

ƒ∈L(mQ) ⇔ ƒ=ƒ₀φ₀+ƒ₁φ₁+ . . . +ƒ_(m)φ_(m) with ƒ₀,ƒ₁, . . . , ƒ_(m)∈𝔽_(q)  (37)

As a convention, we set ƒ_(i)=0 for i∈{o₁,o₂, . . . , o_(g)}. Next, we consider the ring K_(Q) of rational functions that have poles only at Q, which is given by $K_{Q}\overset{\mathrm{def}}{=}\bigcup\limits_{m = 0}^{\infty}L(mQ)$

The interpolation polynomial that one needs to compute at the soft interpolation step turns out to be a polynomial over K_(Q). Notice that we can write any polynomial A(Y) in K_(Q)[Y] in the form ${A(Y)} = {\sum\limits_{i = 0}^{\infty}{\sum\limits_{j = 0}^{\infty}{a_{i,j}\varphi_{i}Y^{j}}}}$

The following definition is the counterpart of Definition 1 for algebraic-geometry codes, where the basis φ₀,φ₁, . . . plays the role of the variable X.

Definition 5. Let ${A(Y)} = {\sum\limits_{i = 0}^{\infty}{\sum\limits_{j = 0}^{\infty}{a_{i,j}\varphi_{i}Y^{j}}}}$

be a polynomial over K_(Q) and let w, w_(Y) be nonnegative real numbers. The (w,w_(Y))-weighted Q-valuation of A(Y) is defined as the maximum of iw+jw_(Y) over all i,j such that a_(i,j)≠0.

It remains to define interpolation points and their multiplicities. As for Reed-Solomon codes, the interpolation points are pairs (P,α), where P is a point in D and α∈𝔽_(q). In order to define the multiplicity of a generic interpolation point (P,α), we need to introduce a new basis for L(mQ). For each point P≠Q on X, we introduce the functions φ_(0,P),φ_(1,P), . . . , φ_(m,P)∈L(mQ) defined as follows. If there exists a function ƒ∈L(mQ) such that v_(P)(ƒ)=i, then we set φ_(i,P)=ƒ (if there is more than one such function ƒ, we fix one arbitrarily). Otherwise, we set φ_(i,P)=0 by convention. Again, it is a direct consequence of the Riemann-Roch theorem that there exist g numbers o_(1,P),o_(2,P), . . . , o_(g,P), called the gaps at P, such that φ_(o_(i,P),P)=0. It is known (H. Stichtenoth, Algebraic Function Fields and Codes, Berlin: Springer-Verlag, 1993) that the functions φ_(0,P),φ_(1,P), . . . , φ_(m,P) form a basis for L(mQ), and we can write any ƒ∈L(mQ), by a simple change of basis, as follows

ƒ=ƒ_(0,P)φ_(0,P)+ƒ_(1,P)φ_(1,P)+ . . . +ƒ_(m,P)φ_(m,P) with ƒ_(0,P),ƒ_(1,P), . . . , ƒ_(m,P)∈𝔽_(q)  (38)

where ƒ_(i,P)=0 for i∈{o_(1,P),o_(2,P), . . . , o_(g,P)} by convention. Since (38) holds for all m≧0, the above clearly extends to a basis φ_(0,P),φ_(1,P), . . . for the ring K_(Q). The following definition is the counterpart of Definition 2 for algebraic-geometric codes.

Definition 6. Let A(Y) be a polynomial in K_(Q)[Y], and consider the shifted polynomial B(Y)=A(Y+α) expressed in the basis φ_(0,P),φ_(1,P), . . . , namely $\begin{matrix} {{B(Y)} = {\sum\limits_{i = 0}^{\infty}{\sum\limits_{j = 0}^{\infty}{b_{i,j}\varphi_{i,P}Y^{j}}}}} & (39) \end{matrix}$

We say A(Y) has a zero of multiplicity m at the interpolation point (P,α), if b_(i,j)=0 for i+j<m and there exists a nonzero coefficient b_(i,j) with i+j=m.

It follows from (39) that the change of basis from φ₀,φ₁, . . . to φ_(0,P),φ_(1,P), . . . plays the role of the shift in the variable X. We are now ready to describe the second step of our algorithm.

Soft interpolation step: Given the point set D = {P₁,P₂, . . . , P_(n)} for the code ℂ_(X)(m,Q) and the multiplicity matrix M = [m_(i,j)], compute the (nontrivial) polynomial Q_(M)(Y) ∈ K_(Q)[Y] of minimal (1,m)-weighted Q-valuation that has a zero of multiplicity at least m_(i,j) at the interpolation point (P_(j),α_(i)) for every i,j such that m_(i,j) ≠ 0.

Let (ƒ(P₁),ƒ(P₂), . . . , ƒ(P_(n))) be a codeword in ℂ_(X)(m,Q). As in the case of Reed-Solomon codes, we associate with each such codeword an expression of the type Y−ƒ, where ƒ∈L(mQ). The third and final step of our algorithm thus consists of the following.

Factorization step: Given the polynomial Q_(M)(Y) ∈ K_(Q)[Y], identify all the factors of Q_(M)(Y) of type Y − ƒ with ƒ ∈ L(mQ). The output of the algorithm is a list of the codewords that correspond to these factors.

Note that the factorization above takes place in the ring K_(Q)[Y], and not over 𝔽_(q)[X,Y]. In fact, the polynomial Q_(M)(Y) usually will not have a linear factor (in Y) over 𝔽_(q)[X,Y].

6. Performance Improvements for Certain Channels

We have observed in Section 4 that it is possible to improve the asymptotic (for s→∞) performance of our soft-decision decoding algorithm on a binary symmetric channel by assuming a critical crossover probability ε* that is different from the actual crossover probability of the BSC. In this section, we explore this possibility further for certain important classes of communication channels.

All channels considered in this section are discrete, meaning that the output alphabet is discrete and finite. Such discrete channels can be conveniently characterized by an |𝒳|×|𝒴| transition-probability matrix W whose rows and columns are indexed by x∈𝒳 and y∈𝒴, respectively, and whose entries are ${W_{y|x}\left( {x,y} \right)}\overset{\mathrm{def}}{=}{{\Pr \left( \Upsilon = y \mid \mathrm{X} = x \right)} = {f\left( y \middle| x \right)}}$

where the ƒ(·|x) are transition-probability mass functions, as defined in (7). The specific channels investigated in this section are: the q-ary symmetric channel, the q-ary symmetric channel with erasures, and a simplified version of the q-ary PSK channel. All these channels are, in fact, hard-decision channels, but the decoding method of the present invention can be used on such channels as well. The necessary modifications to the method are taught in this section.

In each case, our goal is to maximize the set of correctable error patterns, for a code of a given rate. In each case, we find that to achieve this goal, we have to disregard the actual channel parameters, and use instead parameters derived from the rate of the code at hand and from the properties of the decoding algorithm. Thus, the resulting solutions are ideally suited to the frequently encountered situation where the channel parameters are not known, or not known exactly. This is somewhat related to the recent work on universal decoding, or decoding under channel mismatch (A. Lapidoth, P. Narayan, Reliable communication under channel uncertainty, IEEE Trans. Info. Theory, vol. 44, pp. 2148-2177, 1998). We believe that similar strategies for improving the asymptotic performance of our soft-decoding algorithm can be developed for any discrete channel.

6.1. The q-ary Symmetric Channel

Consider a q-ary symmetric channel with parameter ε. For this channel, 𝒳=𝒴=𝔽_(q) and the transition-probability matrix is given by ${W_{y|x}\left( {x,y} \right)} = \left\{ \begin{matrix} {1 - ɛ} & {y = x} \\ \frac{ɛ}{q - 1} & {y \neq x} \end{matrix} \right.$

Suppose that a codeword c∈𝔽_(q)^(n) of a Reed-Solomon (or BCH) code ℂ_(q)(n,k) is transmitted over a q-ary symmetric channel, and a vector y∈𝔽_(q)^(n) is observed at the channel output. Then the corresponding reliability matrix is given by $\begin{matrix} {\Pi = {{\left( {1 - ɛ} \right)\lbrack y\rbrack} + {\frac{ɛ}{q - 1}\left( {1 - \lbrack y\rbrack} \right)}}} & (40) \end{matrix}$

By symmetry, the multiplicity matrix M that maximizes the fraction of correctable errors will have only two different values, say a and b. Explicitly, we have M=a[y]+b(1−[y]). The cost of M is thus 𝒞(M)=na(a+1)/2+n(q−1)b(b+1)/2. If t errors occur during the transmission, then the score of the transmitted codeword is given by S_(M)(c)=a(n−t)+bt. Thus, the condition S_(M)(c)>√(2k𝒞(M)) of Corollary 5 is satisfied if $\begin{matrix} {\tau < \frac{a - \sqrt{{{Ra}\left( {a + 1} \right)} + {{R\left( {q - 1} \right)}{b\left( {b + 1} \right)}}}}{a - b}} & (41) \end{matrix}$

where τ=t/n and R=k/n. It follows that to maximize the fraction of correctable errors, one should choose a and b so as to maximize the right-hand side of (41), subject to a constraint on the normalized cost 𝒞′(M)=2𝒞(M)/n (which determines the complexity of decoding). While, in general, this appears to be a difficult nonlinear optimization problem, there is a simple solution for the asymptotic case, when the total number of interpolation points s=na+n(q−1)b tends to infinity. For large a and b, the normalized cost is well approximated by a²+(q−1)b². Let b′=b√(q−1). Then the condition for successful decoding can be rewritten as $\begin{matrix} {\frac{{a\left( {1 - \tau} \right)} + {b\,\tau}}{\sqrt{a^{2} + {\left( {q - 1} \right)b^{2}}}} = {\frac{{a\left( {1 - \tau} \right)} + {\frac{\tau}{\sqrt{q - 1}}b^{\prime}}}{\sqrt{a^{2} + b^{\prime 2}}} \geq \sqrt{R}}} & (42) \end{matrix}$

It is easy to see that for a fixed τ, the left-hand side of (42) is maximized if (a,b) is a multiple of (1−τ, τ/(q−1)). For these values of a and b, the inequality (42) reduces to the following $\begin{matrix} {{\left( {1 - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \geq R} & (43) \end{matrix}$

Given a Reed-Solomon code of rate R, we first determine the maximal τ such that (43) is satisfied. This value of τ then determines the values of a=γ(1−τ) and b=γτ/(q−1), where the scaling constant γ is chosen so that a and b are positive integers (such a constant exists if τ is rational). With this choice of a and b, our soft-decoding algorithm, operating on the multiplicity matrix M=a[y]+b(1−[y]), will produce a list that contains the transmitted codeword c provided that at most τn errors occur during the transmission, for some τ that satisfies (43).
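The following short C program is an illustrative sketch of this computation only (it is not part of the exemplary program of Appendix B; the parameter values q=64 and R=0.5 and the scaling constant gamma=100 are arbitrary assumptions made for the example, with rounding used in place of an exact rational choice of γ). It finds the maximal τ satisfying (43) by treating (43) as an equality, and derives a pair (a,b) approximately proportional to (1−τ, τ/(q−1)):

#include <stdio.h>
#include <math.h>

/* Illustrative sketch: largest tau with (1-tau)^2 + tau^2/(q-1) >= R,
   obtained from the quadratic (q/(q-1)) t^2 - 2 t + (1-R) = 0, and
   multiplicities (a,b) roughly proportional to (1-tau, tau/(q-1)). */
int main(void)
{
    double q = 64.0, R = 0.5;                  /* example parameters (assumed) */
    double A = q/(q - 1.0), B = -2.0, C = 1.0 - R;
    double disc = B*B - 4.0*A*C;
    if (disc < 0.0) {                          /* (43) then holds for every tau */
        printf("(43) is satisfied for all tau\n");
        return 0;
    }
    double tau = (-B - sqrt(disc))/(2.0*A);    /* smaller root = maximal correctable tau */
    double gamma = 100.0;                      /* illustrative scaling constant */
    int a = (int) floor(gamma*(1.0 - tau) + 0.5);
    int b = (int) floor(gamma*tau/(q - 1.0) + 0.5);
    printf("maximal tau = %f,  a = %d,  b = %d\n", tau, a, b);
    return 0;
}

For q=64 and R=½, this sketch gives τ≈0.294, marginally beyond the Guruswami-Sudan radius 1−√R≈0.293; the gap τ²/(q−1) widens for smaller alphabets, as noted in the following paragraph.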

The bound (43) is stronger than the results reported by Guruswami and Sudan in V. Guruswami, M. Sudan, op. cit. The difference τ²/(q−1) with respect to the bound (1−τ)²≧R derived in V. Guruswami, M. Sudan, op. cit. is especially significant for small alphabets. In fact (43) reduces to (34) for q=2; the resulting improvement over the Guruswami-Sudan list decoding is illustrated in FIG. 7.

We observe that the bound (43) can also be derived using an equivalent, but different approach. This approach is described in what follows, since it will be useful in subsequent subsections. We consider directly the asymptotic case s→∞, and employ Theorem A.5. For the reliability matrix in (40), we have ${\langle{\Pi,\Pi}\rangle} = {n\left( \frac{ɛ^{2}}{q - 1} + \left( {1 - ɛ} \right)^{2} \right)}$ ${\langle{\Pi,\lbrack c\rbrack}\rangle} = {{\left( {1 - ɛ} \right)\left( {n - t} \right)} + \frac{t\,ɛ}{q - 1}}$

It follows that on a q-ary symmetric channel, the asymptotic condition for successful decoding in Theorem A.5 reduces to the following $\begin{matrix} {\frac{{\frac{\tau}{\sqrt{q - 1}}\frac{ɛ}{\sqrt{q - 1}}} + {\left( {1 - \tau} \right)\left( {1 - ɛ} \right)}}{\left( {\frac{ɛ^{2}}{q - 1} + \left( {1 - ɛ} \right)^{2}} \right)^{1/2}} \geq {\sqrt{R} + {o(1)}}} & (44) \end{matrix}$

Recall that our goal is to maximize τ for a fixed code rate R. The key idea in this context is to regard the channel parameter ε not as given, but as a variable. Indeed, if we vary ε in (40) and use the resulting matrix as the “reliability” matrix at the input to our soft-decoding algorithm, then the condition for successful decoding in (44) remains valid and varies accordingly. Hence, we should choose ε to maximize the left-hand side of (44) for a given rate R. Intuitively, this corresponds to maximizing the expected score with respect to the “worst” channel (or, equivalently, the largest ε) that we can still handle for a given rate. A straightforward calculation now shows that the left-hand side of (44) is maximized for ε=τ. For this value of ε, (44) is precisely the same condition on τ as (43).
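Explicitly, substituting ε=τ into the left-hand side of (44) gives $\frac{\frac{\tau^{2}}{q - 1} + (1 - \tau)^{2}}{\left( \frac{\tau^{2}}{q - 1} + (1 - \tau)^{2} \right)^{1/2}} = \left( (1 - \tau)^{2} + \frac{\tau^{2}}{q - 1} \right)^{1/2}$ so that, asymptotically, (44) becomes $\left( (1 - \tau)^{2} + \frac{\tau^{2}}{q - 1} \right)^{1/2} \geq \sqrt{R} + o(1)$, which upon squaring is precisely (43).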

We note that a similar analysis can be carried out for algebraic-geometric codes. Although we will not mention AG codes again in this section, this remark applies to the subsequent subsections as well.

6.2. The q-ary Symmetric Erasure Channel

For a q-ary symmetric erasure channel, we have 𝒳=𝔽_(q) and 𝒴=𝔽_(q)∪{φ}, where the special symbol φ denotes erasure. The transition-probability matrix is given by ${W_{y|x}\left( {x,y} \right)} = \left\{ \begin{matrix} \zeta & {y = \varphi} \\ {\left( {1 - \zeta} \right)\left( {1 - ɛ} \right)} & {y = x} \\ {\left( {1 - \zeta} \right)\frac{ɛ}{q - 1}} & {y \neq x\ \text{and}\ y \neq \varphi} \end{matrix} \right.$

where ε and ζ denote the error probability and the erasure probability, respectively. Suppose that a codeword c of a Reed-Solomon (or BCH) code ℂ_(q)(n,k) is transmitted over a q-ary symmetric erasure channel and a vector y=(y₁,y₂, . . . , y_(n)) over 𝔽_(q)∪{φ} is observed at the channel output. It will be convenient to introduce the q×n real-valued matrix {y} defined as follows: {y}_(i,j)=1 if y_(j)=α_(i), {y}_(i,j)=1/q if y_(j)=φ, and {y}_(i,j)=0 otherwise. With this notation, the corresponding reliability matrix is given by $\begin{matrix} {\Pi = {{\left( {1 - ɛ} \right)\left\{ y \right\}} + {\frac{ɛ}{q - 1}\left( {1 - \left\{ y \right\}} \right)}}} & (45) \end{matrix}$

Using the approach described at the end of Section 6.1, we consider the asymptotic case, when the total number of interpolation points tends to infinity, and employ Theorem A.5. Suppose that τn errors and σn erasures occurred during the transmission. Then ${\langle{\Pi,\Pi}\rangle} = {{{n\left( {1 - \sigma} \right)}\left( {\left( {1 - ɛ} \right)^{2} + \frac{ɛ^{2}}{q - 1}} \right)} + \frac{n\quad \sigma}{q}}$ ${\langle{\Pi,\lbrack c\rbrack}\rangle} = {{{n\left( {1 - \sigma - \tau} \right)}\left( {1 - ɛ} \right)} + {n\quad \tau \frac{ɛ}{q - 1}} + \frac{n\quad \sigma}{q}}$

It follows that on a q-ary symmetric erasure channel, the asymptotic condition for successful decoding in Theorem A.5 reduces to the following $\begin{matrix} {\frac{{\left( {1 - \sigma - \tau} \right)\left( {1 - ɛ} \right)} + \frac{\tau\,ɛ}{q - 1} + \frac{\sigma}{q}}{\left( {{\left( {1 - \sigma} \right)\left( {\left( {1 - ɛ} \right)^{2} + \frac{ɛ^{2}}{q - 1}} \right)} + \frac{\sigma}{q}} \right)^{1/2}} \geq {\sqrt{R} + {o(1)}}} & (46) \end{matrix}$

It is easy to see that the left-hand side of (46) is maximized for ε=τ/(1−σ). For this value of ε, the inequality (46) reduces to the following bound $\begin{matrix} {{{\frac{1}{1 - \sigma}\left( {\left( {1 - \sigma - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \right)} + \frac{\sigma}{q}} \geq R} & (47) \end{matrix}$

The bound (47) describes the asymptotic performance of the following algorithm. Upon observing y at the output of a q-ary symmetric erasure channel, we count the number σn of erasures. Given q and R (determined by the code) and the observed value of σ, we solve for the largest τ that satisfies (47). Next, we set ε=τ/(1−σ) in (45) and use the resulting matrix Π as the “reliability” matrix at the input to our soft-decoding algorithm. Then, as the number s of interpolation points approaches infinity, the algorithm corrects any fraction of τ errors and σ erasures, provided that τ and σ satisfy (47).
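A corresponding C sketch for this channel (again illustrative only and not part of the exemplary program of Appendix B; the parameter values q, R, and sigma below are arbitrary assumptions) computes, for a given q, R, and observed erasure fraction σ, the largest τ satisfying (47) and the value ε=τ/(1−σ) to be substituted into (45):

#include <stdio.h>
#include <math.h>

/* Illustrative sketch: (47) is equivalent to (u - t)^2 + t^2/(q-1) >= Rp,
   where u = 1 - sigma and Rp = (R - sigma/q)(1 - sigma); treating it as an
   equality gives the quadratic (q/(q-1)) t^2 - 2 u t + (u^2 - Rp) = 0. */
int main(void)
{
    double q = 64.0, R = 0.5, sigma = 0.10;   /* example parameters (assumed) */
    double u  = 1.0 - sigma;
    double Rp = (R - sigma/q)*u;
    double A = q/(q - 1.0), B = -2.0*u, C = u*u - Rp;
    double disc = B*B - 4.0*A*C;
    if (disc < 0.0) {
        printf("(47) is satisfied for every tau in [0, 1-sigma]\n");
        return 0;
    }
    double tau = (-B - sqrt(disc))/(2.0*A);   /* smaller root = maximal correctable tau */
    double eps = tau/u;                        /* value used in (45) */
    printf("sigma = %.3f:  maximal tau = %f,  epsilon = %f\n", sigma, tau, eps);
    return 0;
}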

6.3. A Simplified q-PSK Channel

We consider a simplified model of the q-ary phase-shift keying (q-PSK) channel. Namely, we distinguish only between errors to the nearest neighbors in the q-PSK constellation and errors beyond the nearest neighbors. This model is usually quite adequate in practice.

For a q-PSK channel, both the input alphabet and the output alphabet have q letters, and it will be convenient to identify both alphabets with ℤ_(q), the ring of integers modulo q. Thus, 𝒳=𝒴={0,1, . . . , q−1}, with addition modulo q. (Of course, for the purposes of algebraic decoding, we can still identify 𝒳 and 𝒴 with 𝔽_(q).) The transition-probability matrix for our model is given by ${W_{y|x}\left( {x,y} \right)} = \left\{ \begin{matrix} {1 - ɛ - \rho} & {y = x} \\ {\rho/2} & {y = x + 1\ \text{or}\ y = x - 1} \\ \frac{ɛ}{q - 3} & {y \notin \left\{ {{x - 1},x,{x + 1}} \right\}} \end{matrix} \right.$

where ρ is the probability of error to one of the two nearest neighbors and ε is the probability of error beyond the nearest neighbors. If a codeword c∈ℂ_(q)(n,k) is transmitted and a vector y∈𝔽_(q)^(n) is observed at the channel output, the reliability matrix is given by $\begin{matrix} {\Pi = {\left( {1 - ɛ - \rho} \right)\left\{ y \right\}} + {\frac{\rho}{2}\left( {\left\lbrack {y + \underline{1}} \right\rbrack + \left\lbrack {y - \underline{1}} \right\rbrack} \right)} + {\frac{ɛ}{q - 3}\left( {1 - \left\lbrack {y + \underline{1}} \right\rbrack - \left\{ y \right\} - \left\lbrack {y - \underline{1}} \right\rbrack} \right)}} & (48) \end{matrix}$

where 1 denotes the all-one vector of length n, and the notation [·] is extended to vectors over _(q) in the obvious way. Suppose that σn errors to the nearest neighbors and τn errors beyond the nearest neighbors have occurred during the transmission. Then ${\langle{\Pi,\Pi}\rangle} = {n\left( {\left( {1 - ɛ - \rho} \right)^{2} + \frac{\rho^{2}}{2} + \frac{ɛ^{2}}{q - 3}} \right)}$ ${\langle{\Pi,\lbrack c\rbrack}\rangle} = {{{n\left( {1 - \tau - \sigma} \right)}\left( {1 - ɛ - \rho} \right)} + \frac{n\quad \sigma \quad \rho}{2} + \frac{n\quad \tau \quad ɛ}{q - 3}}$

It follows that for (our model of) the q-PSK channel, the asymptotic condition for successful decoding in Theorem A.5 reduces to the following $\begin{matrix} {\frac{{\left( {1 - \tau - \sigma} \right)\left( {1 - ɛ - \rho} \right)} + \frac{\tau \quad ɛ}{q - 3} + \frac{\sigma \quad \rho}{2}}{\left( {\left( {1 - ɛ - \rho} \right)^{2} + \frac{ɛ^{2}}{q - 3} + \frac{\rho^{2}}{2}} \right)^{1/2}} \geq {\sqrt{R} + {o(1)}}} & (49) \end{matrix}$

It is easy to see that the left-hand side of (49) is maximized for ε=τ and ρ=σ. For these values of ε and ρ, the inequality (49) reduces to the following bound $\begin{matrix} {{\left( {1 - \tau - \sigma} \right)^{2} + \frac{\tau^{2}}{q - 3} + \frac{\sigma^{2}}{2}} \geq R} & (50) \end{matrix}$
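Indeed, with ε=τ and ρ=σ the numerator of (49) coincides with the quantity under the square root in its denominator, so that the left-hand side of (49) equals $\left( (1 - \tau - \sigma)^{2} + \frac{\tau^{2}}{q - 3} + \frac{\sigma^{2}}{2} \right)^{1/2}$; squaring the resulting inequality yields (50).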

However, it is not immediately clear what this bound describes. If the actual value of τ were known at the decoder, we could proceed as follows. Given τ,q, and R, solve for the largest σ that satisfies (50). Then set ε=τ and ρ=σ in (48) and use the resulting matrix Π as the “reliability” matrix at the input to our soft-decoding algorithm. The algorithm then corrects (for s→∞) any fraction of σ errors to the nearest neighbors and τ errors beyond the nearest neighbors, provided that σ and τ satisfy (50). The problem is that the number τn of errors beyond the nearest neighbors that actually occurred is generally not known at the decoder. A simple work-around in this situation is to try all the n possible values of τ. Clearly, this strategy increases the overall decoding complexity by at most a linear factor. Moreover, in practice, it would be safe to assume that τ is small (τ≧2ε, say, is extremely unlikely), so that the complexity increase is by a factor much less than n.

APPENDIX A Asymptotic Analysis

In this appendix, we investigate the multiplicity matrix M(Π,s) produced by Algorithm A as s→∞. We shall see that for s→∞ the matrix M(Π,s) becomes proportional to Π. Based on this observation, we derive an asymptotic condition for successful list-decoding, and provide a geometric characterization of the asymptotic decoding regions of the algorithm described in this specification.

We start with two simple lemmas. In all of the subsequent analysis, we keep the reliability matrix Π=[π_(i,j)] fixed, while s ranges over the positive integers. For notational convenience, we define Φ={1,2, . . . , q}×{1,2, . . . , n}. Let χ(Π) denote the support of Π, that is, the set of all (i,j)∈Φ such that π_(i,j)≠0. Let m_(i,j)(s) denote the entries in the matrix M(Π,s) produced by Algorithm A, put forth in Section 3 of the Description of Preferred Embodiment part of this specification.
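For reference, the following is an illustrative sketch (written in the style of the exemplary program of Appendix B but not part of it; the dimensions Q and N and the function name algorithm_A are assumptions made only for the example) of the greedy multiplicity assignment analyzed in this appendix: on every one of the s iterations, the entry whose ratio π_(i,j)/(m_(i,j)+1) is currently largest is incremented by one, in accordance with (A.2).

#include <stdio.h>

#define Q 4   /* alphabet size (illustrative value only) */
#define N 5   /* code length (illustrative value only) */

/* Greedy multiplicity assignment sketch: rel[][] holds the reliabilities
   pi_{i,j}; m[][] receives the multiplicities m_{i,j}(s) after s iterations. */
void algorithm_A(double rel[Q][N], int s, int m[Q][N])
{
    int i, j;
    for (i = 0; i < Q; i++)
        for (j = 0; j < N; j++)
            m[i][j] = 0;
    while (s-- > 0) {
        int bi = 0, bj = 0;
        double best = -1.0;
        for (i = 0; i < Q; i++)
            for (j = 0; j < N; j++) {
                double ratio = rel[i][j]/(m[i][j] + 1);   /* cf. (A.2) */
                if (ratio > best) { best = ratio; bi = i; bj = j; }
            }
        m[bi][bj]++;    /* exactly one position is updated per iteration */
    }
}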

Lemma A.1. As s→∞, every nonzero entry in M(Π,s) grows without bound. In other words, m_(i,j)(s)→∞ as s→∞ for all (i,j)∈χ(Π).

Consider now the proof of Lemma A.1:

Define m_(max)(s)=max_((i,j)∈Φ)m_(i,j)(s) and m_(min)(s)=min_((i,j)∈χ(Π))m_(i,j)(s). Clearly, it would suffice to show that m_(min)(s)→∞ as s→∞. Notice that $\begin{matrix} {s = {{\sum\limits_{i = 1}^{q}\quad {\sum\limits_{j = 1}^{n}\quad {m_{i,j}(s)}}} \leq {{qn}\quad {m_{\max}(s)}}}} & \text{(A.1)} \end{matrix}$

It follows from (A.1) that m_(max)(s)→∞ as s→∞. Hence there exists an infinite integer sequence s₁,s₂,s₃, . . . defined by the property that m_(max)(s_(r))=r and m_(max)(s_(r)+1)=r+1. The iterative nature of Algorithm A implies that for all s≧1, there is exactly one position (i₀,j₀) such that m_(i₀,j₀)(s+1)=m_(i₀,j₀)(s)+1. We say that (i₀,j₀) is the position updated at iteration s+1 of Algorithm A. This position is distinguished by the property that $\begin{matrix} {\frac{\pi_{i_{0},j_{0}}}{m_{i_{0},j_{0}}(s) + 1} \geq \frac{\pi_{i,j}}{m_{i,j}(s) + 1}\quad \text{for all}\ (i,j) \in \Phi} & \text{(A.2)} \end{matrix}$

For r=1,2, . . . , let (i_(r),j_(r)) denote the position updated at iteration s_(r)+1 of Algorithm A. Then it follows from (A.2) and the definition of s_(r) that $m_{i,j}(s_{r}) + 1 \geq \frac{\pi_{i,j}}{\pi_{i_{r},j_{r}}}\left( m_{\max}(s_{r}) + 1 \right) \geq \frac{\pi_{\min}}{\pi_{\max}}\,r + \frac{\pi_{\min}}{\pi_{\max}}\quad \text{for all}\ (i,j) \in \chi(\Pi)$

where π_(max)=max_((i,j)∈Φ)π_(i,j) and π_(min)=min_((i,j)∈χ(Π))π_(i,j). Denoting by ρ the ratio π_(min)/π_(max), we conclude from the above that m_(min)(s_(r))≧ρr+ρ−1. Since ρ is a positive constant while r→∞ as s→∞, it follows that m_(min)(s) grows without bound for s→∞. This completes the proof of Lemma A.1.

Henceforth, let (i_(s),j_(s)) denote the position updated at iteration s of Algorithm A, and consider the sequence of ratios of the increase in the expected score to the increase in cost at successive iterations of Algorithm A, namely ${\theta_{1}\overset{\mathrm{def}}{=}\frac{\pi_{i_{1},j_{1}}}{m_{i_{1},j_{1}}(1)}},\ {\theta_{2}\overset{\mathrm{def}}{=}\frac{\pi_{i_{2},j_{2}}}{m_{i_{2},j_{2}}(2)}},\ {\theta_{3}\overset{\mathrm{def}}{=}\frac{\pi_{i_{3},j_{3}}}{m_{i_{3},j_{3}}(3)}},\cdots$

It follows from (A.2) that the sequence θ₁,θ₂, . . . is non-increasing. Clearly, θ₁=π_(max) while lim_(s→∞)θ_(s)=0 by Lemma A.1.

Lemma A.2. For every positive integer s, there exists a positive constant K=K(s)≦π_(max), such that

K(m _(i,j)(s)+1)≧π_(i,j) ≧Km _(i,j)(s) for all (i,j)∈Φ  (A.3)

Conversely, for every positive constant K≦π_(max) there exists a positive integer s=s(K) such that the inequality (A.3) holds.

Consider now the proof of Lemma A.2:

Given s, we choose K=K(s) so that θ_(s+1)≦K≦θ_(s), which is always possible as the sequence θ₁,θ₂, . . . is non-increasing. To prove the first inequality in (A.3), observe that $K \geq \theta_{s + 1} = \frac{\pi_{i_{s + 1},j_{s + 1}}}{m_{i_{s + 1},j_{s + 1}}\left( {s + 1} \right)} = \frac{\pi_{i_{s + 1},j_{s + 1}}}{m_{i_{s + 1},j_{s + 1}}(s) + 1} \geq \frac{\pi_{i,j}}{m_{i,j}(s) + 1}\quad \text{for all}\ (i,j) \in \Phi$

where the last inequality follows from (A.2). The second inequality in (A.3) holds vacuously if m_(i,j)(s)=0, so assume that m_(i,j)(s)≧1. This assumption implies that position (i,j) was updated at least once, and we let s*≦s denote the number of the most recent iteration of Algorithm A at which position (i,j) was updated. Then ${K \leq \theta_{s} \leq \theta_{s^{*}}} = {\frac{\pi_{i,j}}{m_{i,j}\left( s^{*} \right)} = \frac{\pi_{i,j}}{m_{i,j}(s)}}$

where the last equality follows from the fact that position (i,j) was not updated since iteration s*. Finally, given a positive constant K≦π_(max), we choose s=s(K) so that θ_(s+1)≦K≦θ_(s) once again. This choice is possible because the sequence θ₁,θ₂, . . . is non-increasing, θ₁=π_(max), and lim_(s→∞)θ_(s)=0. The proof then remains exactly the same. This completes the proof of Lemma A.2.

Since (A.3) holds for all (i,j)∈Φ, both inequalities in (A.3) remain valid under summation over all (i,j)∈Φ. Thus it follows from Lemma A.2 that: $\begin{matrix} {{\sum\limits_{{({i,j})} \in \Phi}{K\left( {{m_{ij}(s)} + 1} \right)}} \geq {\sum\limits_{{({i,j})} \in \Phi}\quad \pi_{ij}} \geq {\sum\limits_{{({i,j})} \in \Phi}\quad {{Km}_{i,j}(s)}}} & \text{(A.4)} \end{matrix}$

These inequalities lead to upper and lower bounds on the constant K=K(s) in Lemma A.2. Since Σ_((i,j)∈Φ)m_(i,j)(s)=s while Σ_((i,j)∈Φ)π_(i,j)=n, we conclude from (A.4) that $\begin{matrix} {\frac{n}{s} \geq {K(s)} \geq \frac{n}{s + {qn}}} & \text{(A.5)} \end{matrix}$

Next, we define the normalized multiplicity matrix M′(Π,s)=[μ_(i,j)(s)] and the normalized reliability matrix Π′=[π′_(i,j)] as follows: μ_(i,j)(s)=m_(i,j)(s)/s and π′_(i,j)=π_(i,j)/n for all (i,j)∈Φ. It is clear from these definitions that ⟨M′,1⟩=⟨Π′,1⟩=1, where 1 denotes the all-one matrix. The following theorem is the key result of this appendix: the theorem shows that the optimal multiplicity matrix M(Π,s) becomes proportional to Π as s→∞.

Theorem A.3. As s→∞, the normalized multiplicity matrix converges to the normalized reliability matrix. In other words, for every ε>0, there exists an s₀ such that for all s≧s₀ we have $\begin{matrix} {\left| \pi_{i,j}^{\prime} - \mu_{i,j}(s) \right| = \left| \frac{\pi_{i,j}}{n} - \frac{m_{i,j}(s)}{s} \right| \leq ɛ\quad \text{for all}\ (i,j) \in \Phi} & \text{(A.6)} \end{matrix}$

Consider now the proof for Theorem A.3:

It follows from Lemma A.2 that for all s, there exists a constant K(s) such that 1≧π_(i,j)/K(s)−m_(i,j)(s)≧0 for all (i,j)∈Φ. Dividing this inequality by s, we obtain $\begin{matrix} {\frac{1}{s} \geq {\frac{\pi_{i,j}}{{sK}(s)} - {\mu_{i,j}(s)}} \geq 0} & \text{(A.7)} \end{matrix}$

From the bounds on K(s) in (A.5), we conclude that π′_(i,j)≦π_(i,j)/sK(s)≦π′_(i,j)+qπ_(i,j)/s. Combining this with (A.7), we get $\begin{matrix} {\frac{1}{s} \geq {\pi_{i,j}^{\prime} - {\mu_{i,j}(s)}} \geq {- \frac{q\quad \pi_{\max}}{s}}} & \left( {A{.8}} \right) \end{matrix}$

It follows that for all s≧max{1/ε, qπ_(max)/ε}=qπ_(max)/ε (note that qπ_(max)≧1, since each column of Π sums to 1), the bound in (A.6) holds for all (i,j)∈Φ. Thus we may take s₀=qπ_(max)/ε. This completes the proof of Theorem A.3.

We conclude this appendix with a geometric characterization of the (asymptotic) decoding regions of our soft-decision decoding algorithm. To start with, consider Lemma A.4.

Lemma A.4. For a given multiplicity matrix M, the algebraic soft-decision decoding algorithm outputs a list that contains a codeword c∈ℂ_(q)(n,k) if $\begin{matrix} {\frac{\langle{M,\lbrack c\rbrack}\rangle}{\sqrt{{\langle{M,M}\rangle} + {\langle{M,1}\rangle}}} > \sqrt{k}} & \text{(A.9)} \end{matrix}$

Theorem A.3 and Lemma A.4 lead to a precise characterization of the performance limits of our algorithm as the number of interpolation points approaches infinity. In the following theorem, o(1) denotes a function of s that tends to zero as s→∞. Consider now Theorem A.5:

Theorem A.5. The algebraic soft-decision decoding algorithm outputs a list that contains a codeword c∈ℂ_(q)(n,k) if $\begin{matrix} {\frac{\langle{\Pi,\lbrack c\rbrack}\rangle}{\sqrt{\langle{\Pi,\Pi}\rangle}} \geq {\sqrt{k} + {o(1)}}} & \text{(A.10)} \end{matrix}$

Consider now the proof for Theorem A.5:

Substituting the optimal multiplicity matrix M(Π,s) in (A.9) and normalizing (dividing the numerator and the denominator by s), we obtain the equivalent condition $\begin{matrix} {\frac{\langle{{M^{\prime}\left( {\Pi,s} \right)},\lbrack c\rbrack}\rangle}{\sqrt{{\langle{{M^{\prime}\left( {\Pi,s} \right)},{M^{\prime}\left( {\Pi,s} \right)}}\rangle} + \frac{1}{s}}} > \sqrt{k}} & \text{(A.11)} \end{matrix}$

It follows from Theorem A.3 that for s→∞, one can replace M′(Π,s) in (A.11) by Π′, which upon re-normalization yields (A.10). More explicitly, we have ${\frac{\langle{{M^{\prime}\left( {\Pi,s} \right)},\lbrack c\rbrack}\rangle}{\sqrt{{\langle{{M^{\prime}\left( {\Pi,s} \right)},{M^{\prime}\left( {\Pi,s} \right)}}\rangle} + \frac{1}{s}}} \geq \frac{{\langle{\Pi,\lbrack c\rbrack}\rangle} - \frac{n^{2}}{s}}{\sqrt{{\langle{\Pi,\Pi}\rangle} + \frac{\left( {{2q\,\pi_{\max}} + 1} \right)n^{2}}{s} + \frac{q^{3}n^{3}\pi_{\max}^{2}}{s^{2}}}}} = {\frac{\langle{\Pi,\lbrack c\rbrack}\rangle}{\sqrt{\langle{\Pi,\Pi}\rangle}} + {o(1)}}$

where the first inequality follows from (A.8) after some straightforward manipulations. In conjunction with (A.11), this completes the proof of Theorem A.5.

Finally, Theorem A.5 has a particularly nice interpretation if the reliability matrix Π and the codeword [c] are viewed as vectors in the qn-dimensional Euclidean space ℝ^(qn). Since ⟨[c],[c]⟩=n and k=nR, dividing both sides of (A.10) by √n shows that the condition of Theorem A.5 is equivalent to

 cos β ≧ √R + o(1)

where β is the angle between the vectors Π and [c] in ℝ^(qn).

Thus the asymptotic decoding regions of our algorithm are spherical cones in the Euclidean space ℝ^(qn), extending from the origin to the surface of a sphere of radius √n. The codeword [c] is a point on this sphere, and the line connecting the origin to this point constitutes the central axis of the corresponding spherical cone. The central angle of each spherical cone is cos⁻¹√R. Notice that since the algorithm is a list-decoding algorithm, its decoding regions are not disjoint: the spherical cones of angle cos⁻¹√R are overlapping.

It follows from Theorem 2 that the asymptotic (for m→∞) decoding regions of the Guruswami-Sudan algorithm are spherical caps on the surface of the same sphere, of the same spherical angle cos⁻¹√R, but the decoding process involves projecting Π onto a point [y] on the surface of the sphere in a nonlinear fashion, according to equation (10). Finally, the decoding regions of conventional Berlekamp-Welch hard-decision decoding are also spherical caps on the surface of the sphere, and the same nonlinear projection is employed, but the spherical angle of these caps is cos⁻¹(½+½R).

APPENDIX B Exemplary Programmed Implementation of the Method of the Present Invention

The method of the present invention for the algebraic soft-decision decoding of codes of the Reed-Solomon (RS) and Bose-Chaudhuri-Hocquenghem (BCH) types may be implemented in special purpose digital logic or as a software program running on a general purpose microprocessor, or digital computer. The following is an exemplary program, written in the C programming language, implementing the method of the present invention.

It will be understood by practitioners of the programming arts that the program could equally well have been written in many other programming languages, and that variations in the code, and even in the structure of the program, are possible while nonetheless realizing the same purpose and the same method. In the event of any inconsistencies between the performance of the program and the method of the invention as taught by reference to the mathematical process thereof within the Detailed Description of the Preferred Embodiment section of this specification, the description within the Detailed Description of the Preferred Embodiment shall control.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define q 64
#define q2 8
#define max_grad 100
#define pi 3.141592654
#define st dummy=getc(stdin)

typedef struct {
  double p;
  int x;
  int y;
  int mult;
} recc;

int k;
int s;
int ls;
int rs;
int no_cons;
int no_tests;
int no_right_soft;
int no_right_hard;
int error_weight;
int ssudanscore;
float dbb;
double sigma;
double sp[q][2];
double M[q][q];
int A[q][q];
int cword[q];
int hard_dec[q];
int no_errors;
int soft_sudan_score;
recc AA[q*q];

void init();
double sq(double x);
void random_word(void);
int choose_int_points(int ll);
static int reccompare(recc *a, recc *b);
char dummy;

/*******************************************
 *   main procedure                        *
 *******************************************/
main()
{
  int i, j, test, a, l;

  printf("\n Code length is: %d", q);
  printf("\n Enter db and hit the return key: ");
  scanf("%f", &dbb);
  printf("\n Enter dimension k and hit the return key: ");
  scanf("%d", &k);
  printf("\n Enter multiplicity s and hit the return key: ");
  scanf("%d", &s);
  sigma = exp(-dbb/20.0);
  printf("\n sigma:%f", sigma);
  init();
  srand(109);
  no_right_soft = 0;
  no_right_hard = 0;
  for (i = 1; i < 100000; i++) {
    random_word();
    qsort((recc *) AA, q*q, sizeof(recc), reccompare);
    ssudanscore = choose_int_points(no_cons);
    if (ssudanscore > ls) no_right_soft++;
    if (error_weight <= ceil(q - (float) ls/s - 1)) no_right_hard++;
    if (i % 1 == 0)
      printf(" Score: %d, necessary score: %d", ssudanscore, ls);
    st;
  }
}
/**************** main ends ***************/

/*******************************************
 *   init procedure                        *
 *******************************************/
void init()
{
  int ii, jj;
  double ss;

  /* determination of ls, no_cons etc. */
  for (ii = 0; ((k-1)*((ii-1)*ii)) < (q*(s*(s+1))); ii++);
  ii = ii - 1;
  rs = ii;
  ls = floor(q*((float)s*((float)s+1))/2.0/rs + ((float)rs-1.0)*(k-1.0)/2);
  printf("\nminimum success score: %d", ls);
  printf("\nerror correction (Sudan's algorithm)<=%f", ceil(q - (float) ls/s - 1));
  no_cons = q*(s*(s+1))/2;
  printf("\nnumber of linear constraints: %d", no_cons);

  /* determination of signal points */
  ss = 0;   /* ss is the average energy */
  for (ii = 0; ii < q2; ii++)
    for (jj = 0; jj < q2; jj++) {
      sp[ii*q2+jj][0] = ii - q2/2 + 0.5;
      sp[ii*q2+jj][1] = jj - q2/2 + 0.5;
      ss = ss + sq(sp[ii*q2+jj][0]) + sq(sp[ii*q2+jj][1]);
    }
  ss = ss/q;
  for (ii = 0; ii < q; ii++) {
    if ((ii % q2) == 0) printf("\n");
    sp[ii][0] = sp[ii][0]/sqrt(ss)/sigma;
    sp[ii][1] = sp[ii][1]/sqrt(ss)/sigma;
  }
  st;
}
/************ init ends *********************/

/*******************************************
 *   random word procedure                 *
 *******************************************/
void random_word(void)
{
  int i, j, iii;
  double u1, u2, mx, sss;
  double pr;
  double dist[q];
  double rr[2];

  error_weight = 0;
  for (i = 0; i < q; i++) {
    cword[i] = 36;   /* rand() % q; */
    u1 = (double)rand()/(1<<15);
    u2 = (double)rand()/(1<<15);
    rr[0] = sp[cword[i]][0] + sqrt(-2*log(u2))*cos(2*pi*u1);
    rr[1] = sp[cword[i]][1] + sqrt(-2*log(u2))*sin(2*pi*u1);
    pr = 0;
    mx = 0;
    for (j = 0; j < q; j++) {
      sss = (sq(sp[j][0]-rr[0]) + sq(sp[j][1]-rr[1]))/2;
      dist[j] = exp(-sss);
      if (dist[j] > mx) { mx = dist[j]; hard_dec[i] = j; }
      pr = pr + dist[j];
    }
    if (hard_dec[i] != cword[i]) error_weight++;
    sss = 0;
    for (j = 0; j < q; j++) {
      M[i][j] = dist[j]/pr; sss = M[i][j] + sss;
      iii = i*q + j;
      AA[iii].p = M[i][j]; AA[iii].x = i; AA[iii].y = j; AA[iii].mult = 0;
    }
  }
}
/******* random word ends *******************/

/*******************************************
 *  choose_int_points procedure            *
 *******************************************/
int choose_int_points(int ll)
{
  int ii, i, j, ipos, jpos, used_so_far, rig;
  double maxi, sss, expscore;
  recc temp;

  for (i = 0; i < q; i++)
    for (j = 0; j < q; j++)
      A[i][j] = 0;
  soft_sudan_score = 0;
  expscore = 0;
  used_so_far = 0;
  while (used_so_far < ll) {
    for (i = 0; (((AA[i].mult + used_so_far) > ll) && (i < q*q)); i++);
    if (i < q*q) {
      AA[i].mult = AA[i].mult + 1;
      expscore = expscore + AA[i].p;
      used_so_far = AA[i].mult + used_so_far;
      if (cword[AA[i].x] == AA[i].y) { soft_sudan_score++; }
      AA[i].p = AA[i].p*AA[i].mult/(AA[i].mult + 1);
      temp = AA[i];
      while ((temp.p < AA[i+1].p) && (i+2 < q*q)) {
        AA[i] = AA[i+1];
        i = i + 1;
      }
      AA[i] = temp;
    }
  }
  return (soft_sudan_score);
}

/*******************************************
 *   auxiliary procedures                  *
 *******************************************/
double sq(double x) { return (x*x); }

static int reccompare(recc *a, recc *b)
{
  if (((*a).p/((*a).mult+1)) > ((*b).p/((*b).mult+1))) return (-1);
  if (((*a).p/((*a).mult+1)) < ((*b).p/((*b).mult+1))) return (1);
  return (0);
}

What is claimed is:
 1. A method of decoding an error-correcting code comprising a plurality of codewords, each codeword having multiple symbols, the error-correcting code being of the Reed-Solomon type and/or of the Bose-Chaudhuri-Hocquenghem (BCH) type, the method of decoding comprising the steps of: 1) converting reliability information concerning individual code symbols into algebraic interpolation conditions; 2) finding a non-trivial polynomial that satisfies said interpolation conditions; and 3) generating a list of candidate codewords of the Reed-Solomon code or the BCH code.
 2. The method according to claim 1 wherein the reliability information comprise reliability information concerning the relative likelihoods of individual code symbols, and/or, equivalently, data from which the reliability information concerning the relative likelihoods of individual code symbols can be computed, and wherein the method of decoding comprises: 1) given the reliability information, computing from this reliability information a set of algebraic interpolation conditions; 2) given the algebraic interpolation conditions, interpolating to find a non-trivial interpolation polynomial that satisfies these conditions; and 3) given the interpolation polynomial, factoring the polynomial to find factors that correspond to codewords of the Reed-Solomon code or the BCH code, therein generating a list of candidate codewords.
 3. The method according to claim 2, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented in special purpose digital logic.
 4. The method according to claim 2, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented by software in a general purpose digital computing device, including a digital signal processing (DSP) integrated circuit or a digital computer.
 5. The method according to claim 2 wherein the error-correcting code comprises a coding scheme taken from the group consisting of concatenated coding schemes, multilevel coding schemes, and combinations thereof.
 6. The method according to claim 2 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a communication system.
 7. The method according to claim 2 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a recording system.
 8. The method according to claim 2, wherein a coding gain provided by the method of decoding is traded-off for a computational complexity of the method of decoding, relatively more coding gain being realized at relatively more computational complexity while relatively less coding gain is realized at relatively less computational complexity.
 9. The method according to claim 2 further comprising the step, after the step of 3) factoring: 4) selecting one or more codeword(s) from the list of candidate codewords generated in the 3) factoring, thereby forming a subset of selected codewords.
 10. The method according to claim 1 wherein the error-correcting code comprises an (n,k,d) Reed-Solomon code over a finite field _(q) with q elements, or an (n,k′,d′) BCH code over a finite field _(q′) with q′ elements, where n≧1 is the length, k≦n is the dimension, and d=n−k+1 is the minimum distance of the Reed-Solomon code, and where n is the length, k′≦k is the dimension, and d′≦d is the minimum distance of the BCH code, the BCH code being a subfield subcode of the Reed-Solomon code, wherein the reliability information comprises, for each of the multiple symbols, reliability information concerning the relative likelihood of each code symbol being equal to α∈_(q), for each element α of the finite field _(q), and/or, equivalently, data from which such reliability information can be computed, and wherein the method of decoding comprises: 1) given the reliability information, computing from this information non-negative integer-valued numbers m_(i,j) for each position j in the code, where 1≦j≦n, and for each element α_(i)∈_(q), where 1≦i≦q, so that the total number of non-zeros among the non-negative integer-valued numbers m_(i,j) depends on the reliability information, the total number of non-zeros being possibly greater than n; 2) given the numbers m_(i,j), interpolating to find a (non-trivial) polynomial Q_(M)(X,Y) that has a zero of multiplicity at least m_(i,j) at the point (x_(j),α_(i)) for each j in the range 1≦j≦n and each i in the range 1≦i≦q, the polynomial Q_(M)(X,Y) best selected to have the least possible (1,k−1)-weighted degree; and 3) given the polynomial Q_(M)(X,Y), factoring this polynomial to find factors of type Y−ƒ(X), where the degree of the polynomial ƒ(X) is less than k, each factor corresponding to a codeword of the Reed-Solomon code or of the BCH code, thereby generating a list of candidate codewords; wherein each of the 1) computing and the 2) interpolating and the 3) factoring has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the Reed-Solomon code and/or the BCH code; and wherein the entire decoding method thus has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the Reed-Solomon code and/or the BCH code.
 11. The method according to claim 10 wherein the reliability information further comprises a positive integer s, which is an arbitrary design parameter indicating the total number of interpolation points and governing the computational complexity of the entire decoding method; and wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: 1.1) computing from said input data an initial set of numbers π_(i,j), where each number π_(i,j) suitably reflects the relative likelihood of the j-th code symbol being equal to α_(i)∈_(q), for each integer j in the range 1≦j≦n and for each element α_(i) of the finite field _(q), where i is an integer in the range 1≦i≦q; 1.2) initializing so as to set m_(i,j)=0 and π*_(i,j)=π_(i,j) for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n; 1.3) finding the index (i₀,j₀) of a number π*_(i) ₀ _(,j) ₀ so that π*_(i) ₀ _(,j) ₀ ≧π*_(i,j)  for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n; 1.4) deriving $\left. \pi_{i_{0}j_{0}}^{*}\leftarrow\frac{\pi_{i_{0}j_{0}}}{m_{i_{0}j_{0}} + 2} \right.$ m_(i₀j₀) ← m_(i₀j₀) + 1 s ← s − 1

1.5) deciding if s=0, and if so then outputting the non-negative integer-valued numbers m_(i,j) for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n, else returning to performing the 1.3) finding and the 1.4) deriving and the 1.5) deciding.
 12. The method according to claim 11, wherein the computing of non-negative integer-valued numbers m_(i,j) continues until a stopping criterion is reached, which stopping criterion is based on the computational-complexity design parameter s being preset to an arbitrary fixed positive integer value.
 13. The method according to claim 10, wherein the computing of non-negative integer-valued numbers m_(i,j) is contingent upon a stopping criterion that is adaptively set during the computing using the results of the computing.
 14. The method according to claim 10, wherein the step of 3) factoring comprises factoring to yield a correct codeword c among the list of candidate codewords if S _(M)(c)>Δ(C)  where S_(M)(c) is a particular function, called a score, of the codeword c with respect to the non-negative integer-valued numbers m_(i,j) generated by the 1) computing, C is the total cost of the non-negative integer-valued numbers m_(i,j), and Δ(C)<{square root over (2kC)} is the least non-negative integer such that the total number of bivariate monomials of (1,k−1)-weighted degree at most Δ(C) is greater than C.
 15. A method of decoding an algebraic-geometric error-correcting code comprising a plurality of codewords, each codeword having multiple symbols, the method of decoding comprising the steps of: 1) converting reliability information concerning individual code symbols into algebraic interpolation conditions; 2) finding a non-trivial polynomial that satisfies said interpolation conditions; and 3) generating a list of candidate codewords of the algebraic-geometric code.
 16. The method according to claim 15 wherein the reliability information comprises reliability information concerning the relative likelihoods of individual code symbols, and/or, equivalently, data from which the reliability information concerning the relative likelihoods of individual code symbols can be computed, and wherein the method of decoding comprises: 1) given the reliability information, computing from this reliability information a set of algebraic interpolation conditions, 2) given the algebraic interpolation conditions, interpolating to find a non-trivial interpolation polynomial that satisfies these conditions; and 3) given the interpolation polynomial, factoring the polynomial to find factors that correspond to codewords of the algebraic-geometric code, therein generating a list of candidate codewords.
 17. The method according to claim 16, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented in special purpose digital logic.
 18. The method according to claim 16, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented by software in a general purpose digital computing device, including a digital signal processing (DSP) integrated circuit or a digital computer.
 19. The method according to claim 16 wherein the error-correcting code comprises a coding scheme taken from the group consisting of concatenated coding schemes, multilevel coding schemes, and combinations thereof.
 20. The method according to claim 16 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a communication system.
 21. The method according to claim 16 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a recording system.
 22. The method according to claim 16, wherein a coding gain provided by the method of decoding is traded-off for a computational complexity of the method of decoding, relatively more coding gain being realized at relatively more computational complexity while relatively less coding gain is realized at relatively less computational complexity.
 23. The method according to claim 16 further comprising the step, after the step of 3) factoring: 4) selecting one or more codeword(s) from the list of candidate codewords generated in the 3) factoring, thereby forming a subset of selected codewords.
 24. The method according to claim 15 wherein the error-correcting code comprises an (n,k,d) algebraic-geometric code over a finite field _(q) with q elements, where n≧1 is the length, k≦n is the dimension, and d≦n−k+1 is the minimum distance of the algebraic-geometric code, wherein the reliability information comprises, for each of the multiple symbols, reliability information concerning the relative likelihood of each code symbol being equal to α∈_(q), for each element α of the finite field _(q), and/or, equivalently, data from which such reliability information can be computed, and wherein the method of decoding comprises: 1) given the reliability information, computing from this information non-negative integer-valued numbers m_(i,j) for each position j in the code, where 1≦j≦n, and for each element α_(i)∈_(q), where 1≦i≦q, so that the total number of non-zeros among the non-negative integer-valued numbers m_(i,j), depends on the reliability information, the total number of non-zeros being possibly greater than n; 2) given the numbers m_(i,j), interpolating to find a (non-trivial) polynomial Q_(M)(Y) over the ring K_(Q)[Y] that has a zero of multiplicity at least m_(i,j) at the point (P_(j),α_(i)) for each j in the range 1≦j≦n and each i in the range 1≦i≦q, where for each j in the range 1≦j≦n, P_(j)≠Q is a point on the algebraic curve used to construct the algebraic-geometric code, and where the polynomial Q_(M)(Y) is best selected to have the least possible (1,m)-weighted Q-valuation, Q being a fixed point on the algebraic curve distinct from the points P₁,P₂, . . . , P_(n), and m being the highest pole order at Q of rational functions (on the algebraic curve) that are evaluated in the construction of the algebraic-geometric code; and 3) given the polynomial Q_(M)(Y), factoring this polynomial to find factors of type Y−ƒ, where ƒ is a rational function whose highest pole order at Q is at most m, each such factor corresponding to a codeword of the algebraic-geometric code, thereby generating a list of candidate codewords; wherein each of the 1) computing and the 2) interpolating and the 3) factoring has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the algebraic-geometric code; and wherein the entire decoding method thus has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the algebraic-geometric code.
 25. The method according to claim 24 wherein the reliability information further comprises a positive integer s, which is an arbitrary design parameter indicating the total number of interpolation points and governing the computational complexity of the entire decoding method; and wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: 1.1) computing from said input data an initial set of numbers π_(i,j), where each number π_(i,j) suitably reflects the relative likelihood of the j-th code symbol being equal to α_(i)∈_(q), for each integer j in the range 1≦j≦n and for each element α_(i) of the finite field _(q), where i is an integer in the range 1≦j≦q; 1.2) initializing so as to set m_(i,j)=0 and π*_(i,j)=π_(i,j) for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n; 1.3) finding the index (i₀,j₀) of a number π*_(i) ₀ _(,j) ₀ so that π*_(i) ₀ _(,j) ₀ ≧π*_(i,j)  for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n; 1.4) deriving $\left. \pi_{i_{0}j_{0}}^{*}\leftarrow\frac{\pi_{i_{0}j_{0}}}{m_{i_{0}j_{0}} + 2} \right.$ m_(i₀j₀) ← m_(i₀j₀) + 1 s ← s − 1

1.5) deciding if s=0, and if so then outputting the non-negative integer-valued numbers m_(i,j) for all integers i in the range 1≦i≦q and for all integers j in the range 1≦j≦n, else returning to performing the 1.3) finding and the 1.4) deriving and the 1.5) deciding.
 26. The method according to claim 24, wherein the computing of non-negative integer-valued numbers m_(i,j) continues until a stopping criterion is reached, which stopping criterion is based on the computational-complexity design parameter s being preset to an arbitrary fixed positive integer value.
 27. The method according to claim 24, wherein the computing of non-negative integer-valued numbers m_(i,j) is contingent upon a stopping criterion that is adaptively set during the computing using the results of the computing.
 28. The method according to claim 24, wherein the step of 3) factoring comprises factoring to yield a correct codeword c among the list of candidate codewords if S _(M)(c)>Δ(C)  where S_(M)(c) is a particular function, called a score, of the codeword c with respect to the non-negative integer-valued numbers m_(i,j) generated by the 1) computing, C is the total cost of the non-negative integer-valued numbers m_(i,j), and Δ_(X)(C)<g+{square root over (2m(C+g)+g ²)}  is the least non-negative integer such that the total number of different non-zero expressions of type φ_(i)Y^(j) with (1,m)-weighted Q-valuation at most Δ_(X)(C) is greater than C, and where g is the genus of the algebraic curve used in the construction of the algebraic-geometric code.
 29. A method of decoding a Reed-Solomon error-correcting code of length n and rate R over a finite field _(q) with q elements or a Bose-Chaudhuri-Hocquenghem (BCH) error-correcting code of length n and rate R′ over the finite field _(q) on a channel of type drawn from the group consisting of q-ary symmetric channels, q-ary symmetric erasure channels, and q-PSK channels, where input to the decoding method consists of data that includes: a vector y=(y₁,y₂, . . . , y_(n)) observed at the channel output, where all the n elements of the vector y belong to the finite field _(q) in the case of q-ary symmetric channels and q-PSK channels, whereas in the case of q-ary symmetric erasure channels, all the n elements of the vector y belong to _(q)∪{φ}, where the special symbol φ denotes erasure, the decoding method comprising: 1) given the vector y=(y₁,y₂, . . . , y_(n)), computing from this vector y a set of algebraic interpolation conditions that specify values of an interpolation polynomial; 2) given the algebraic interpolation conditions interpolating to find a (non-trivial) interpolation polynomial that satisfies these conditions; and 3) given the interpolation polynomial, factoring this polynomial to find factors that correspond to codewords of the Reed-Solomon code or the BCH code, thereby generating a list of candidate codewords.
 30. The method according to claim 29 wherein the error-correcting code comprises a Reed-Solomon code or a BCH code on a q-ary symmetric channel, or a q-ary symmetric erasure channel, or a q-PSK channel, the decoding method comprising: 1) given the vector y=(y₁,y₂, . . . , y_(n)), computing from this vector y non-negative integer-valued numbers m_(i,j) for each positions j in the code, where 1≦j≦n, and for each element as α_(i)∈_(q), where 1≦i≦q, the total number of non-zeros among the non-negative numbers m_(i,j) being possibly greater than n; 2) given the numbers m_(i,j), interpolating to find a (non-trivial) polynomial Q_(M)(X,Y) that has a zero of multiplicity at least m_(i,j) at the point (x_(j),α_(i)) for each j in the range 1≦j≦n and each i in the range 1≦i≦q, the polynomial Q_(M)(X,Y) best selected to have the least possible (1,nR−1)-weighted degree; and 3) given the polynomial Q_(M)(X,Y), factoring this polynomial to find factors of type Y−ƒ(X), where the degree of the polynomial ƒ(X) is less than nR, each such factor corresponding to a codeword of the Reed-Solomon code or the BCH code, therein generating a list of candidate codewords, wherein each of the 1) computing and the 2) interpolating and the 3) factoring has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the Reed-Solomon code and/or the BCH code, and wherein the entire decoding method thus has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the Reed-Solomon code and/or the BCH code.
 31. The method according to claim 30 wherein the error-correcting code comprises a Reed-Solomon code or a BCH code on a q-ary symmetric channel, and wherein the 1) computing of non-negative integer-valued numbers m_(i,j), comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j)=α_(i) and m_(i,j)≠a otherwise, where y_(j) is the j-th element of the observed vector y and α_(i) is the i-th element of the finite field _(q), and where a is an appropriately chosen positive integer; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the decoding method produces a list of candidate codewords that contains the particular codeword at the input to the q-ary symmetric channel, provided that less than t errors have occurred, further provided that the fraction τ=t/n satisfies the inequality: ${\left( {1 - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \geq R$

 where q is the size of the alphabet of the q-ary symmetric channel and R is the rate of the Reed-Solomon code being decoded or, in the case of decoding BCH codes, the rate of the Reed-Solomon code that underlies the BCH code being decoded.
 32. The decoding method according to claim 30 wherein the error-correcting code comprises a Reed-Solomon code or a BCH code on a q-ary symmetric erasure channel, and wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j)=α_(i), m_(i,j)=b if y_(j)=φ, and m_(i,j)∉{a,b} otherwise, where y_(j) is the j-th element of the observed vector y while α_(i) is the i-th clement of the finite field _(q) and φ is a special symbol denoting erasure, and where a,b are appropriately chosen positive integers; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the decoding method produces a list of candidate codewords that contains the particular codeword at the input to the q-ary symmetric erasure channel, provided that less than t errors have occurred, further provided that the fraction τ=t/n satisfies the inequality: ${{\frac{1}{1 - \sigma}\left( {\left( {1 - \sigma - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \right)} + \frac{\sigma}{q}} \geq R$

 where q is the size of the alphabet of the q-ary symmetric erasure channel, not counting the symbol φ denoting erasure, σ is the fraction of erasures that occurred, and R is the rate of the Reed-Solomon code being decoded or, in the case of decoding BCH codes, the rate of the Reed-Solomon code that underlies the BCH code being decoded.
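A corresponding hedged sketch of claim 32 (illustrative only) follows; here the string "?" stands in for the erasure symbol φ, and the "otherwise" entries are again set to 0 by assumption.

    # Sketch under stated assumptions: "?" denotes erasure, "otherwise" entries are 0.

    ERASURE = "?"

    def multiplicities_qec(y, alphabet, a, b):
        """m[i][j] = a if y[j] == alphabet[i], b if y[j] is an erasure, else 0."""
        return [[a if y_j == alpha_i else (b if y_j == ERASURE else 0) for y_j in y]
                for alpha_i in alphabet]

    def qec_threshold_holds(tau, sigma, q, rate):
        """True if (1/(1 - sigma)) * ((1 - sigma - tau)^2 + tau^2/(q - 1)) + sigma/q >= R."""
        return ((1 - sigma - tau) ** 2 + tau ** 2 / (q - 1)) / (1 - sigma) + sigma / q >= rate

    # Toy example: q = 4, R = 0.5, erasure fraction 0.1, error fraction 0.2.
    M = multiplicities_qec(y=[0, "?", 1, 3], alphabet=[0, 1, 2, 3], a=2, b=1)
    print(qec_threshold_holds(tau=0.2, sigma=0.1, q=4, rate=0.5))  # approx. 0.584 >= 0.5 -> True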
 33. The decoding method according to claim 30 wherein the error-correcting code comprises a Reed-Solomon code or a BCH code on a q-PSK channel, and wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j) is the q-PSK symbol that corresponds to α_(i), m_(i,j)=b if y_(j) is a nearest neighbor of the q-PSK symbol that corresponds to α_(i), and m_(i,j)∉{a,b} otherwise, where y_(j) is the j-th element of the observed vector y and α_(i) is the i-th element of the finite field _(q), and where a,b are appropriately chosen positive integers; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the method produces a list of candidate codewords that contains the particular codeword at the input to the q-PSK channel, provided that less than s errors to the nearest neighbors and less than t errors beyond the nearest neighbors have occurred, further provided that the fractions σ=s/n and τ=t/n satisfy the inequality: ${\left( {1 - \tau - \sigma} \right)^{2} + \frac{\tau^{2}}{q - 3} + \frac{\sigma^{2}}{2}} \geq R$

 where q is the size of the alphabet of the q-PSK channel and R is the rate of the Reed-Solomon code being decoded or, in the case of decoding BCH codes, the rate of the Reed-Solomon code that underlies the BCH code being decoded.
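A hedged sketch of claim 33 (illustrative only): the received symbols are assumed to be given as constellation indices 0, . . . , q−1 with the field element α_(i) mapped to index i, the two nearest neighbors on the PSK circle are the indices i±1 mod q, and the "otherwise" entries are set to 0 by assumption.

    # Sketch under stated assumptions: received symbols are PSK constellation indices.

    def multiplicities_qpsk(received_indices, q, a, b):
        """m[i][j] = a if the j-th received index equals i, b if it is one of the two
        nearest neighbors (i +/- 1 mod q) on the PSK circle, else 0."""
        m = []
        for i in range(q):
            row = []
            for r in received_indices:
                if r == i:
                    row.append(a)
                elif r == (i + 1) % q or r == (i - 1) % q:
                    row.append(b)
                else:
                    row.append(0)
            m.append(row)
        return m

    def qpsk_threshold_holds(tau, sigma, q, rate):
        """True if (1 - tau - sigma)^2 + tau^2/(q - 3) + sigma^2/2 >= R (requires q > 3)."""
        return (1 - tau - sigma) ** 2 + tau ** 2 / (q - 3) + sigma ** 2 / 2 >= rate

    # Toy example: 8-PSK, R = 0.5, nearest-neighbor error fraction 0.2, other errors 0.05.
    M = multiplicities_qpsk(received_indices=[0, 3, 7, 2], q=8, a=2, b=1)
    print(qpsk_threshold_holds(tau=0.05, sigma=0.2, q=8, rate=0.5))  # approx. 0.583 >= 0.5 -> True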
 34. The method according to claim 29, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented in special purpose digital logic.
 35. The method according to claim 29, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented by software in a general purpose digital computing device, including a digital signal processing (DSP) integrated circuit or a digital computer.
 36. The method according to claim 29 wherein the error-correcting code comprises a coding scheme taken from the group consisting of concatenated coding schemes, multilevel coding schemes, and combinations thereof.
 37. The method according to claim 29 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a communication system.
 38. The method according to claim 29 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a recording system.
 39. The method according to claim 29 further comprising the step, after the step of 3) factoring: 4) selecting one or more codeword(s) from the list of candidate codewords generated in the 3) factoring, thereby forming a subset of selected codewords.
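As a hedged illustration of the selecting step 4) of claim 39 (not the claimed method itself), one simple selection rule keeps the candidate codeword(s) closest in Hamming distance to the observed vector y; the use of Hamming distance here is an assumption introduced only for this sketch.

    # Sketch under a stated assumption: selection by minimum Hamming distance to y.

    def hamming_distance(c, y):
        """Number of positions in which the codeword c and the observed vector y differ."""
        return sum(1 for ci, yi in zip(c, y) if ci != yi)

    def select_codewords(candidates, y):
        """Return the subset of candidate codewords at minimum Hamming distance from y."""
        if not candidates:
            return []
        best = min(hamming_distance(c, y) for c in candidates)
        return [c for c in candidates if hamming_distance(c, y) == best]

    # Toy example with length-4 candidates over {0, 1, 2, 3}:
    print(select_codewords([[0, 1, 2, 3], [0, 1, 1, 3]], y=[0, 1, 2, 0]))  # -> [[0, 1, 2, 3]]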
 40. A method of decoding an algebraic-geometric error-correcting code of length n, dimension k≦n, and rate R=k/n, over a finite field _(q) with q elements, the algebraic-geometric code being constructed on an algebraic curve of genus g, the method of decoding being applied to decode the algebraic-geometric code on a channel of type drawn from the group consisting of q-ary symmetric channels, q-ary symmetric erasure channels, and q-PSK channels, where input to the decoding method consists of data that includes: a vector y=(y₁,y₂, . . . , y_(n)) observed at the channel output, where all the n elements of the vector y belong to the finite field _(q) in the case of q-ary symmetric channels and q-PSK channels, whereas in the case of q-ary symmetric erasure channels, all the n elements of the vector y belong to _(q)∪{φ}, where the special symbol φ denotes erasure, the decoding method comprising: 1) given the vector y=(y₁,y₂, . . . , y_(n)), computing from this vector y a set of algebraic interpolation conditions that specify values of an interpolation polynomial; 2) given the algebraic interpolation conditions, interpolating to find a (non-trivial) interpolation polynomial that satisfies these conditions; and 3) given the interpolation polynomial, factoring this polynomial to find factors that correspond to codewords of the algebraic-geometric code, thereby generating a list of candidate codewords.
 41. The method according to claim 40 wherein the error-correcting code comprises an algebraic-geometric code on a q-ary symmetric channel, or a q-ary symmetric erasure channel, or a q-PSK channel, the decoding method comprising: 1) given the vector y=(y₁,y₂, . . . , y_(n)), computing from this vector y non-negative integer-valued numbers m_(i,j) for each position j in the code, where 1≦j≦n, and for each element α_(i)∈_(q), where 1≦i≦q, the total number of non-zeros among the non-negative numbers m_(i,j) being possibly greater than n; 2) given the numbers m_(i,j), interpolating to find a (non-trivial) polynomial Q_(M)(Y) over the ring K_(Q)[Y] that has a zero of multiplicity at least m_(i,j) at the point (P_(j),α_(i)) for each j in the range 1≦j≦n and each i in the range 1≦i≦q, where for each j in the range 1≦j≦n, P_(j)≠Q is a point on the algebraic curve used to construct the algebraic-geometric code, and where the polynomial Q_(M)(Y) is best selected to have the least possible (1,m)-weighted Q-valuation, Q being a fixed point on the algebraic curve distinct from the points P₁,P₂, . . . , P_(n), and m being the highest pole order at Q of rational functions (on the algebraic curve) that are evaluated in the construction of the algebraic-geometric code; and 3) given the polynomial Q_(M)(Y), factoring this polynomial to find factors of type Y−ƒ, where ƒ is a rational function whose highest pole order at Q is at most m, each such factor corresponding to a codeword of the algebraic-geometric code, therein generating a list of candidate codewords; wherein each of the 1) computing and the 2) interpolating and the 3) factoring has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the algebraic-geometric code; and wherein the entire decoding method thus has a computational complexity that is always bounded by a polynomial function of the length n of a codeword of the algebraic-geometric code.
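As a hedged illustration of the (1,m)-weighted Q-valuation referenced in step 2) of claim 41 (an editorial sketch, not the claimed implementation), the quantity is assumed here to be the maximum, over the non-zero coefficients ƒ_b of Q_(M)(Y)=Σ_b ƒ_b·Y^b, of the pole order of ƒ_b at Q plus m·b; the dictionary representation and the function name are illustrative assumptions.

    # Sketch under a stated assumption about the definition of the weighted Q-valuation.
    # pole_orders maps the Y-degree b of each non-zero coefficient f_b to the pole
    # order of f_b at the point Q.

    def weighted_q_valuation(pole_orders, m):
        """Return max{pole_order(f_b) + m*b} over the non-zero coefficients of Q_M(Y)."""
        return max(order + m * b for b, order in pole_orders.items())

    # Example: m = 5 and coefficients with pole orders 3, 2, 0 at Y-degrees 0, 1, 2
    # give max(3, 2 + 5, 0 + 10) = 10.
    print(weighted_q_valuation({0: 3, 1: 2, 2: 0}, m=5))  # prints 10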
 42. The decoding method according to claim 41 wherein the error-correcting code comprises an algebraic-geometric code on a q-ary symmetric channel, wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j)=α_(i) and m_(i,j)≠a otherwise, where y_(j) is the j-th element of the observed vector y and α_(i) is the i-th element of the finite field _(q), and where a is an appropriately chosen positive integer; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the decoding method produces a list of candidate codewords that contains the particular codeword at the input to the q-ary symmetric channel, provided that less than t errors have occurred, further provided that the fraction τ=t/n satisfies the inequality: ${\left( {1 - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \geq {R + \frac{g}{n}}$

 where q is the size of the alphabet of the q-ary symmetric channel, R is the rate of the algebraic-geometric code, and g is the genus of the algebraic curve that is used in the construction of the algebraic-geometric code.
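As claims 42 through 44 indicate, the asymptotic conditions for algebraic-geometric codes differ from the corresponding Reed-Solomon conditions of claims 31 through 33 only in that the rate R on the right-hand side is replaced by R+g/n. A hedged sketch for the q-ary symmetric channel condition of claim 42 follows; the analogous adjustments for claims 43 and 44 are obtained the same way.

    # Sketch only: the q-ary symmetric channel condition with the genus correction g/n.

    def ag_qsc_threshold_holds(tau, q, rate, genus, n):
        """True if (1 - tau)^2 + tau^2/(q - 1) >= R + g/n."""
        return (1 - tau) ** 2 + tau ** 2 / (q - 1) >= rate + genus / n

    # Toy example: genus-2 curve, n = 24, q = 8, R = 0.5, error fraction 0.2.
    print(ag_qsc_threshold_holds(tau=0.2, q=8, rate=0.5, genus=2, n=24))  # approx. 0.646 >= 0.583 -> True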
 43. The decoding method according to claim 41 wherein the error-correcting code comprises an algebraic-geometric code on a q-ary symmetric erasure channel, wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j)=α_(i), m_(i,j)=b if y_(j)=φ, and m_(i,j)∉{a,b} otherwise, where y_(j) is the j-th element of the observed vector y while α_(i) is the i-th element of the finite field _(q) and φ is a special symbol denoting erasure, and where a,b are appropriately chosen positive integers; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the decoding method produces a list of candidate codewords that contains the particular codeword at the input to the q-ary symmetric erasure channel, provided that less than t errors have occurred, further provided that the fraction τ=t/n satisfies the inequality: ${{\frac{1}{1 - \sigma}\left( {\left( {1 - \sigma - \tau} \right)^{2} + \frac{\tau^{2}}{q - 1}} \right)} + \frac{\sigma}{q}} \geq {R + \frac{g}{n}}$

 where q is the size of the alphabet of the q-ary symmetric erasure channel, not counting the symbol φ denoting erasure, σ is the fraction of erasures that occurred, R is the rate of the algebraic-geometric code, and g is the genus of the algebraic curve that is used in the construction of the algebraic-geometric code.
 44. The decoding method according to claim 41 wherein the error-correcting code comprises an algebraic-geometric code on a q-PSK channel, wherein the 1) computing of non-negative integer-valued numbers m_(i,j) comprises: for each integer j in the range 1≦j≦n, setting m_(i,j)=a if y_(j) is the q-PSK symbol that corresponds to α_(i), m_(i,j)=b if y_(j) is a nearest neighbor of the q-PSK symbol that corresponds to α_(i), and m_(i,j)∉{a,b} otherwise, where y_(j) is the j-th element of the observed vector y and α_(i) is the i-th element of the finite field _(q), and where a,b are appropriately chosen positive integers; wherein, as the number of interpolation points in the 2) interpolating becomes very large, the method produces a list of candidate codewords that contains the particular codeword at the input to the q-PSK channel, provided that less than s errors to the nearest neighbors and less than t errors beyond the nearest neighbors have occurred, further provided that the fractions σ=s/n and τ=t/n satisfy the inequality: ${\left( {1 - \tau - \sigma} \right)^{2} + \frac{\tau^{2}}{q - 3} + \frac{\sigma^{2}}{2}} \geq {R + \frac{g}{n}}$

 where q is the size of the alphabet of the q-PSK channel, R is the rate of the algebraic-geometric code, and g is the genus of the algebraic curve that is used in the construction of the algebraic-geometric code.
 45. The method according to claim 40, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented in special purpose digital logic.
 46. The method according to claim 40, wherein at least one of the 1) computing, the 2) interpolating, and the 3) factoring is implemented by software in a general purpose digital computing device, including a digital signal processing (DSP) integrated circuit or a digital computer.
 47. The method according to claim 40 wherein the error-correcting code comprises a coding scheme taken from the group consisting of concatenated coding schemes, multilevel coding schemes, and combinations thereof.
 48. The method according to claim 40 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a communication system.
 49. The method according to claim 40 wherein the step of converting reliability information comprises the step of converting reliability information concerning individual code symbols used in a recording system.
 50. The method according to claim 40 further comprising the step, after the step of 3) factoring: 4) selecting one or more codeword(s) from the list of candidate codewords generated in the 3) factoring, thereby forming a subset of selected codewords. 